Hey Guys,

Just an FYI: my personal preference on things like this is to leave the original issue closed, open a new issue, and link back to the original one. This is mainly from a release management perspective, where we may have already shipped a CHANGES.txt that lists an issue as closed, only for it to be re-opened later, which I think we should avoid.

Cheers,
Chris

On Nov 26, 2011, at 3:56 AM, Michael McCandless wrote:

> Yes please go ahead and reopen TIKA-738... sounds like something is wrong!
>
> Thanks.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Fri, Nov 25, 2011 at 9:25 PM, John M <[email protected]> wrote:
>> Hello,
>>
>> When I use the latest build of the Tika application jar's CLI with the
>> -h option to parse testAnnotations.pdf (from the parsers' test
>> documents folder), added in TIKA-738, the result has two "<p>"
>> elements and three "</p>" elements. Attempting to open this file in
>> the GUI also causes it to crash with a NPE--the same one described in
>> TIKA-778. I see in issue PDFBox-1143 that the code introduced for
>> TIKA-738 will go away once this PDFBox issue is resolved, but perhaps
>> meanwhile PDF2XHTML.java should be modified to produce a different
>> number of "</p>" elements: should one of the
>> "handler.endElement("p");" lines be removed from the endPage method?
>>
>> Thanks,
>> John Mastarone
>>

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
