[ https://issues.apache.org/jira/browse/TIKA-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137110#comment-15137110 ]
Tim Allison edited comment on TIKA-741 at 2/8/16 3:55 PM:
----------------------------------------------------------

I'd recommend adding the following to your EnhancedPDF2XHTML:

{noformat}
     @Override
     protected void writeParagraphStart() throws IOException {
+        super.writeParagraphStart();
{noformat}

and

{noformat}
     @Override
     protected void writeParagraphEnd() throws IOException {
+        super.writeParagraphEnd();
{noformat}

Finally, if your modifications of our PDFParsers are enhancements that have general applicability, please, oh, please share them with us.


was (Author: talli...@mitre.org):
I'd recommend adding the following to your EnhancedPDF2XHTML:

{{noformat}}
     @Override
     protected void writeParagraphStart() throws IOException {
+        super.writeParagraphStart();
{{noformat}}

and

{{noformat}}
     @Override
     protected void writeParagraphEnd() throws IOException {
+        super.writeParagraphEnd();
{{noformat}}

Finally, if your modifications of our PDFParsers are enhancements that have general applicability, please, oh, please share them with us.

> "Zip bomb" (XML nesting) detection is too strict
> ------------------------------------------------
>
>                 Key: TIKA-741
>                 URL: https://issues.apache.org/jira/browse/TIKA-741
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.10
>            Reporter: Erik Hetzner
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 1.0
>
>
> I get "zip bomb" errors from many HTML documents, e.g.
> http://www.akhbaar.org/wesima_articles/index-20100101-82736.html
> Is there a way that the element nesting level could be made configurable? 30
> elements just doesn't seem to be enough.
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)