https://bz.apache.org/bugzilla/show_bug.cgi?id=69769

            Bug ID: 69769
           Summary: XSSFSheetXMLHandler throws NumberFormatException when
                    <v> in t="s" cell contains trailing whitespace
           Product: POI
           Version: 5.4.1-FINAL
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XSSF
          Assignee: dev@poi.apache.org
          Reporter: tim.he...@krause-schopp.de
  Target Milestone: ---

Created attachment 40079
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=40079&action=edit
Broken file -> xl/worksheets/sheet1.xml

When parsing .xlsx files using the SAX/Event API (XSSFSheetXMLHandler), Apache
POI assumes that the <v> value for cells with t="s" (shared string) is always a
valid integer index into sharedStrings.xml. However, if the value contains
trailing whitespace (e.g., "0 "), the parser throws a NumberFormatException
instead of trimming or handling the invalid whitespace gracefully.
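
The failure can be reproduced without POI, because Integer.parseInt rejects any surrounding whitespace. The sketch below contrasts the strict call with a lenient variant that trims first. The class and method names (SstIndexParsing, parseStrict, parseLenient) are hypothetical and chosen for illustration only; they are not part of POI, and this is not meant to describe how POI itself should be patched.

```java
/** Illustration only: hypothetical helper, not a POI class. */
final class SstIndexParsing {
    private SstIndexParsing() {}

    /** Strict parse: Integer.parseInt rejects trailing whitespace, so "2 " throws. */
    static int parseStrict(String raw) {
        return Integer.parseInt(raw);
    }

    /** Lenient parse: strip leading and trailing whitespace before parsing. */
    static int parseLenient(String raw) {
        return Integer.parseInt(raw.trim());
    }
}
```

With the value from the attached file, parseStrict("2 ") throws a NumberFormatException, while parseLenient("2 ") returns 2.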

While trailing whitespace is technically invalid per the ECMA-376 OOXML
specification, other parts of POI (such as the WorkbookFactory / usermodel API)
are more tolerant and can handle such cases. The event model fails hard and
gives client code no opportunity to intercept and fix the value before parsing.
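
As a possible stopgap on the client side, a SAX filter placed between the XML reader and the sheet handler could trim the value before the handler sees it. The following is a minimal sketch using only the JDK's org.xml.sax.helpers.XMLFilterImpl. TrimCellValueFilter is a hypothetical name, and the sketch assumes a namespace-aware parser and that the downstream handler reads the <v> text from characters() events. It has not been verified against POI 5.4.1.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: trims the text of <v> inside cells marked t="s". */
final class TrimCellValueFilter extends XMLFilterImpl {
    private final StringBuilder buffer = new StringBuilder();
    private boolean sharedStringCell; // true while inside <c t="s">
    private boolean inValue;          // true while buffering a <v> of such a cell

    TrimCellValueFilter(XMLReader parent) {
        super(parent);
    }

    // Fall back to the qualified name if the parser reports no local name.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        String n = name(localName, qName);
        if ("c".equals(n)) {
            sharedStringCell = "s".equals(atts.getValue("t"));
        } else if ("v".equals(n) && sharedStringCell) {
            inValue = true;
            buffer.setLength(0);
        }
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (inValue) {
            // Text may arrive in several chunks, so buffer until </v>.
            buffer.append(ch, start, length);
        } else {
            super.characters(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        String n = name(localName, qName);
        if (inValue && "v".equals(n)) {
            char[] trimmed = buffer.toString().trim().toCharArray();
            super.characters(trimmed, 0, trimmed.length);
            inValue = false;
        } else if ("c".equals(n)) {
            sharedStringCell = false;
        }
        super.endElement(uri, localName, qName);
    }
}
```

To use it, the XSSFSheetXMLHandler would be registered via setContentHandler on the filter, and parse would be called on the filter rather than on the underlying reader. Only t="s" cells are touched, so string values in other cell types keep their whitespace.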

This makes POI less fault-tolerant for real-world files that may be generated
by third-party software or exported from non-Excel tools.

Opening the file in Excel corrects the error, because Excel removes the space.
Unfortunately, this correction does not happen on our automated infrastructure.

Stacktrace:
10:11:35.577 [main] ERROR org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler - Failed to parse SST index '2 '
java.lang.NumberFormatException: For input string: "2 "
        at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67) ~[?:?]
        at java.base/java.lang.Integer.parseInt(Integer.java:662) ~[?:?]
        at java.base/java.lang.Integer.parseInt(Integer.java:778) ~[?:?]
        at org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.outputCell(XSSFSheetXMLHandler.java:422) [poi-ooxml-5.4.1.jar:5.4.1]
        at org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.endElement(XSSFSheetXMLHandler.java:316) [poi-ooxml-5.4.1.jar:5.4.1]
        at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:618) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1728) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2899) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:114) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:542) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:889) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:825) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1224) [?:?]
        at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:637) [?:?]
