JXL parsing problems with apache POI generated xls

The problem

We are using apache poi 3.8 to generate excels. Our excel generation is based on templates where we duplicate rows and columns as necessary to fit the data that is fed into them. In this particular usecase we add 23 rows and one column. Our counterparty uses jxl 2.5.5 (released 2005-05-05) to parse the incoming xls files generated by our poi setup. The added rows do not pose a problem but the added column does. The problem is that the values in the added column are not parsed and instead jxl outputs warnings:

Analysis

If you google for the warning you quickly arrive at the jxl.read.biff.SheetReader class as the culprit, more specifically the “addCell()” method. As you can see it will generate warnings for out of bound cells:

Reproducing the error

We set up a small testcase using jxl to parse the excel we sent the counterparty. At the core is this read() method:

I included the latest version of jxl available in the maven repository which is 2.6.12 (released 2009-12-26) at the time of writing. The warning did not occur and the additional column was parsed without a problem. I downgraded the jxl version to the oldest available in the repository: 2.5.7 (released 2005-07-30) and ran it again. This generated the warnings our counterparty was experiencing and resulted in a missing column.

Edit: once we received confirmation of the jxl version being used by our counterparty, we ran the test again with 2.5.5 which yielded the same results as 2.5.7

Solutions

Solution 1: Upgrading jxl version

The reason the newer version of jxl parses the excel correctly is because they updated the addCell() method to:

As you can see the excel generated by apache poi is still incorrect but jxl has learned to deal with it gracefully.

Solution 2: Fixing apache poi

Apache poi updates the dimensions of a sheet every time you add a row or a column. However there is a slight difference between how these additions are handled:
Row addition (org.apache.poi.hssf.model.InternalSheet.addRow():698):

Column addition (org.apache.poi.hssf.model.InternalSheet.addValueRecord():633):

We updated line 633 to:

And generated the excel again. This new excel was parsed correctly by both versions of jxl.
Note that to submit the bug to apache poi, it is best to have a simple code sample able to reproduce the problem. The following method will generate an excel that has the wrong dimensions:

This bug and the suggested fix have been logged in the apache bug tracker.

Author: Alexander Verbruggen

blogger

blogger

Curious to know more about this topic?

Working at i8c

i8c is a system integrator that strives for an informal atmosphere between its employees, who have an average age of approx 30 years old. We invest a lot of effort in the professional development of each individual, through a direct connection between the consultants and the management (no multiple layers of middle management). We are based in Kontich, near Antwerp, but our customers are mainly located in the triangle Ghent-Antwerp-Brussels and belong to the top 500 companies in Belgium (Securex, Electrabel, UCB, etc…).

Quality Assurance

i8c is committed to delivering quality services and providing customer satisfaction. That’s why we invested in the introduction of a Quality Management System, which resulted in our ISO9001:2000 certification. This guarantees that we will meet your expectations, as a reliable, efficient and mature partner for your SOA & integration projects.

i8c - ISO9001-2015

Also worth reading

AWS AppFlow: Streamlining SaaS Integrations with AWS Services

In today’s digital world, organizations are constantly looking for ways to streamline their workflows and improve their data management processes. One of the key challenges that organizations face is integrating their various software as a service (SaaS) applications with their data management systems. This is

Read More »

Apigee Scope Validation using OpenAPI Specification

In API security and management, we often use a lot of different security mechanisms to protect the requested resource behind the API Gateway. One of these mechanisms is the validation of scopes to authorize a client on a specific sub-resource of the API. Most of

Read More »