https://issues.apache.org/bugzilla/show_bug.cgi?id=53810
Priority: P2
Bug ID: 53810
Assignee: dev@poi.apache.org
Summary: [PATCH] fix for incorrect loop detection in NPOIFS
Severity: normal
Classification: Unclassified
OS: Mac OS X 10.4
Reporter: gk...@iongrid.com
Hardware: PC
Status: NEW
Version: 3.8
Component: POIFS
Product: POI

While upgrading our application to use Tika 1.2 (previously Tika 0.9), a few PowerPoint 97-03 (PPT) files which previously parsed correctly started failing with exceptions in NPOIFS.

The root cause appears to be a difference in the way BAT entries are read from XBAT blocks between POIFSFileSystem and NPOIFSFileSystem. In POIFS, the header's getBATCount is used as a hard limit on the number of BATs which are read; in NPOIFS, XBATEntriesPerBlock entries are read from every XBAT, even if this causes more total BAT entries to be read than header.getBATCount. In some files, the extraneous BAT entries are all initialized to the same value, which is then detected as a possible cycle.

The attached PPT file demonstrates this problem (it was found via a web-crawler search for test content, so I cannot grant a license to Apache to redistribute it).

The attached patch implements behavior in NPOIFS similar to what exists in POIFS, and allows the file to parse without exception.
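
For illustration only, here is a rough, self-contained sketch of the difference described above. It is not the attached patch and does not use the POI API: the class XbatReadSketch, its methods, and the sample values are all hypothetical. It also simplifies the real layout by ignoring the BAT entries stored in the header itself and the chain pointer at the end of each XBAT block, and it uses a plain duplicate check as a stand-in for the real loop detection.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not POI code: each int[] stands for the BAT-offset
// slots of one XBAT block.
class XbatReadSketch {

    // Uncapped read: take every slot of every XBAT block, including
    // trailing slots that the header never promised.
    static List<Integer> readUncapped(int[][] xbatBlocks) {
        List<Integer> batOffsets = new ArrayList<>();
        for (int[] block : xbatBlocks) {
            for (int offset : block) {
                batOffsets.add(offset);
            }
        }
        return batOffsets;
    }

    // Capped read: stop as soon as batCount entries have been collected,
    // treating the header's count as a hard limit.
    static List<Integer> readCapped(int[][] xbatBlocks, int batCount) {
        List<Integer> batOffsets = new ArrayList<>();
        for (int[] block : xbatBlocks) {
            for (int offset : block) {
                if (batOffsets.size() >= batCount) {
                    return batOffsets;
                }
                batOffsets.add(offset);
            }
        }
        return batOffsets;
    }

    // Crude stand-in for loop detection: seeing the same offset twice
    // is treated as a possible cycle.
    static boolean hasRepeatedOffset(List<Integer> offsets) {
        Set<Integer> seen = new HashSet<>();
        for (int offset : offsets) {
            if (!seen.add(offset)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Three real entries followed by filler slots that all hold the same value.
        int[][] xbat = { {10, 11, 12, 7, 7, 7} };
        int batCount = 3; // assumed value of the header's BAT count

        List<Integer> uncapped = readUncapped(xbat);
        List<Integer> capped = readCapped(xbat, batCount);

        System.out.println(uncapped + " repeated=" + hasRepeatedOffset(uncapped));
        System.out.println(capped + " repeated=" + hasRepeatedOffset(capped));
    }
}
```

With these sample values the uncapped read picks up the identical filler slots and trips the duplicate check, while the capped read stops after the three entries the header accounts for.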