https://issues.apache.org/bugzilla/show_bug.cgi?id=53810

          Priority: P2
            Bug ID: 53810
          Assignee: dev@poi.apache.org
           Summary: [PATCH] fix for incorrect loop detection in NPOIFS
          Severity: normal
    Classification: Unclassified
                OS: Mac OS X 10.4
          Reporter: gk...@iongrid.com
          Hardware: PC
            Status: NEW
           Version: 3.8
         Component: POIFS
           Product: POI

While upgrading our application to use Tika 1.2 (previously Tika 0.9), a few
PowerPoint 97-2003 (PPT) files that previously parsed correctly started failing
with exceptions in NPOIFS.

The root cause appears to be a difference in how POIFSFileSystem and
NPOIFSFileSystem read BAT entries from XBAT blocks. In POIFS, the header's
getBATCount is used as a hard limit on the number of BATs that are read; in
NPOIFS, XBATEntriesPerBlock entries are read from every XBAT, even if this
brings the total number of BAT entries read above header.getBATCount. In some
files, the extraneous BAT entries are all initialized to the same value, which
is then detected as a possible cycle.
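
To make the difference concrete, here is a minimal, self-contained sketch of
the two reading strategies. It is not the POI code or the attached patch: the
class and method names are hypothetical, the XBAT blocks are modelled as plain
int arrays with a toy size of 4 entries, the BAT entries stored directly in
the header are ignored, and a simple "same offset seen twice" check stands in
for the real loop detection.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in POI.
public class XbatReadSketch {

    /** Uncapped read: takes every slot of every XBAT block. */
    static List<Integer> readAllSlots(int[][] xbatBlocks) {
        List<Integer> offsets = new ArrayList<>();
        for (int[] xbat : xbatBlocks) {
            for (int offset : xbat) {
                offsets.add(offset);
            }
        }
        return offsets;
    }

    /** Capped read: stops once batCount entries have been collected. */
    static List<Integer> readUpTo(int[][] xbatBlocks, int batCount) {
        List<Integer> offsets = new ArrayList<>();
        int remaining = batCount;
        for (int[] xbat : xbatBlocks) {
            int toRead = Math.min(remaining, xbat.length);
            for (int i = 0; i < toRead; i++) {
                offsets.add(xbat[i]);
            }
            remaining -= toRead;
            if (remaining == 0) {
                break; // all declared BAT entries have been read
            }
        }
        return offsets;
    }

    /** Crude stand-in for loop detection: true if any offset appears twice. */
    static boolean hasRepeatedOffset(List<Integer> offsets) {
        Set<Integer> seen = new HashSet<>();
        for (int offset : offsets) {
            if (!seen.add(offset)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two toy XBAT blocks; the last two slots are unused filler
        // that happens to hold the same value.
        int[][] xbats = { {10, 11, 12, 13}, {14, 15, 0, 0} };
        int batCount = 6; // what the header would declare in this toy case

        System.out.println(hasRepeatedOffset(readAllSlots(xbats)));       // true
        System.out.println(hasRepeatedOffset(readUpTo(xbats, batCount))); // false
    }
}
```

With the cap applied, the two filler slots are never read, so the repeated
value never reaches the check.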

The attached PPT file demonstrates this problem (it was found via a web-crawler
search for test content, so I cannot grant a license to Apache to redistribute
it). The attached patch makes NPOIFS behave like POIFS in this respect, and
allows the file to parse without exception.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
For additional commands, e-mail: dev-h...@poi.apache.org
