[jira] [Commented] (TIKA-2191) Apply current .docx unit tests to experimental SAX parser and fix or document as necessary

2016-12-14 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15749164#comment-15749164 ] Hudson commented on TIKA-2191: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1158 (See [h

[jira] [Updated] (TIKA-2191) Apply current .docx unit tests to experimental SAX parser and fix or document as necessary

2016-12-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2191: -- Attachment: element_counts_ooxml-docx.xlsx I counted the elements in the main story .xml file (mostly doc

[jira] [Commented] (TIKA-2201) OutOfMemoryError on a reasonably sized document

2016-12-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15748196#comment-15748196 ] Tim Allison commented on TIKA-2201: --- To be fair, the pptx is 386 MB, with 124 slides, eac

[jira] [Updated] (TIKA-2202) StringIndexOutOfBoundsException on a valid Word document

2016-12-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2202: -- Attachment: TIKA-2202.xml Extracted content with trunk > StringIndexOutOfBoundsException on a valid Word

[jira] [Resolved] (TIKA-2202) StringIndexOutOfBoundsException on a valid Word document

2016-12-14 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2202. Resolution: Cannot Reproduce. Fix Version/s: 1.15. Can't reproduce this with trunk. IIRC, I fixed

[jira] [Commented] (TIKA-2208) Catch missing libraires

2016-12-14 Thread David Pilato (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15748161#comment-15748161 ] David Pilato commented on TIKA-2208: Looks like a good idea. Let me try it and come bac

[jira] [Commented] (TIKA-2208) Catch missing libraires

2016-12-14 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15748104#comment-15748104 ] Nick Burch commented on TIKA-2208: -- Rather than doing it in code, what happens if you spec

[jira] [Commented] (TIKA-2208) Catch missing libraires

2016-12-14 Thread Ryan Ernst (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15747668#comment-15747668 ] Ryan Ernst commented on TIKA-2208: -- Or even better, make the OOXMLParser ctor take which p

[jira] [Created] (TIKA-2208) Catch missing libraires

2016-12-14 Thread David Pilato (JIRA)
David Pilato created TIKA-2208:
Summary: Catch missing libraires
Key: TIKA-2208
URL: https://issues.apache.org/jira/browse/TIKA-2208
Project: Tika
Issue Type: Improvement
Components: