Re: Zip bomb with BoilerpipeContentHandler

2012-07-30 Thread Jukka Zitting
Hi,

On Fri, Jul 27, 2012 at 3:38 PM, Marc-Daniel Ortega wrote:
> Caused by: org.apache.tika.sax.SecureContentHandler$SecureSAXException:
> Suspected zip bomb: 100 levels of XML element nesting

This could be caused by BoilerPipe not closing elements properly. The BoilerPipeContentHandler class w
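
One way to check the unclosed-element hypothesis is to count SAX start and end events and see whether they balance. The following is only a minimal sketch: `NestingDepthProbe` is a hypothetical diagnostic class, not part of Tika or Boilerpipe, and the `main` method simulates an unbalanced event stream by hand rather than running a real parser.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical diagnostic handler (not a Tika class): tracks element nesting. */
class NestingDepthProbe extends DefaultHandler {
    private int depth = 0;     // elements currently open
    private int maxDepth = 0;  // deepest nesting observed so far

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    /** Number of elements opened but not yet closed. */
    public int getOpenElements() {
        return depth;
    }

    /** Deepest nesting level reached. */
    public int getMaxDepth() {
        return maxDepth;
    }

    public static void main(String[] args) throws SAXException {
        NestingDepthProbe probe = new NestingDepthProbe();
        probe.startDocument();
        // Simulate a producer that opens five <p> elements but closes only one.
        for (int i = 0; i < 5; i++) {
            probe.startElement("", "p", "p", new AttributesImpl());
        }
        probe.endElement("", "p", "p");
        probe.endDocument();
        System.out.println("open=" + probe.getOpenElements() + " max=" + probe.getMaxDepth());
    }
}
```

If the open-element count is non-zero at the end of the document, or the maximum depth keeps growing with the number of paragraphs, the handler emitting the events is probably not closing what it opens, which would explain a nesting limit being hit on ordinary input.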

Re: Zip bomb with BoilerpipeContentHandler

2012-07-30 Thread Marc-Daniel Ortega
Thanks, I'll try my best to keep the best of both worlds. I am supposed to be able to parse different content types (PDF, RTF, DOC, HTML, ...), so work on the provided Boilerpipe handler will be needed.

Cheers

On Mon, Jul 30, 2012 at 10:33 AM, Jukka Zitting wrote:
> Hi,
>
> On Fri, Jul 27, 2012 at 3:38 PM, Marc

[ANNOUNCE] Welcome Sergey Beryozkin as Apache Tika PMC member and committer

2012-07-30 Thread Mattmann, Chris A (388J)
Hi Folks,

The Tika PMC has elected to add Sergey Beryozkin as a PMC member and committer. Welcome Sergey! Feel free to say a bit about yourself!

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion L