Thank you for the clarification.

<rbo...@rcbowen.com> wrote on Fri, Mar 17, 2023 at 22:45:
> Hi.
>
> Unfortunately, you've reached the wrong list. This is a general
> community list for the Apache Software Foundation as a whole. Tika has
> its own lists, which you can find at
> https://tika.apache.org/mail-lists.html and that's where the people
> that can help with this hang out.
>
> --Rich
>
> On Fri, 2023-03-17 at 22:02 +0800, 朱桂锋 wrote:
> > Firstly, thank you for the Tika project. It is a great project!
> >
> > Recently I ran Tika to extract text from documents, and I found that
> > Java off-heap memory keeps increasing until memory usage reaches
> > 100%, and the process is then killed by the oom-killer.
> >
> > I then used pmap and dumped data from memory (excluding the Java
> > heap), and found content like this:
> >
> > [ Content
> >
> > Types] . xM1PK
> >
> > rels/.relsPK word/ rels/document.xm1.relsPK word /document.xm1PK
> > word/footer4.xmIPK word/header4. xm1PK word/footer2.xmIPK
> > word/header2. xm1PK word /header3.xmIPK word/footer3.xmlPK
> > word /header1.xm1PK
> >
> > word/ footer1 . xm1PK
> >
> > word / footnotes.xmlPK word/endnotes .xm1PK word/header5. xm1PK
> > word/media/ image3.pngPK word/media/imagel. jpegPK
> > word/media/image2. jpegPK word / theme/ theme 1. xm1PK
> > word/settings. xm1PK
> >
> > customxml/ itemProps2 .xm1PK
> >
> > customXml /item2 . xm1PK docProps /custom. xm1 PK t?92
> > customXml/rels/item1.xm1.relsPK customXml/ rels/item2.xm1.relsPK
> > customXm1 /itemProps1.xm1PK
> >
> >
> >
> > This is office document content. Why is it off-heap? I suspect that
> > parsing some particular office document causes a memory leak.
> >
> > Sorry, I don't know which office document triggers this, and I can't
> > provide a sample.
> >
> >
> > Some more information: when I debugged the code on my own Mac with
> > an xlsx sample, calling tika.detect invoked the
> > ZipArchiveInputStream constructor twice, and
> > java.util.zip.Inflater#end() was called the same number of times.
> > But calling tika.parseToString invoked the ZipArchiveInputStream
> > constructor once, and java.util.zip.Inflater#end() was never called.
> >
> > Could that be the cause of the off-heap memory leak, since Inflater
> > uses native code?
> >
> > I look forward to your reply. Thank you very much!
> >
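
For anyone following the Inflater question above: java.util.zip.Inflater is backed by native zlib, so its working state lives outside the Java heap until end() is called (or until the object is eventually cleaned up after garbage collection). The sketch below only illustrates the explicit-release pattern with plain JDK classes. It does not use Tika or ZipArchiveInputStream, the class and method names are made up for illustration, and it does not show that this is the cause of the leak described in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class, not part of Tika or Commons Compress.
public class InflaterCleanupSketch {

    // Compress with Deflater and release its native zlib state in finally.
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // frees off-heap memory held by zlib
        }
    }

    // Decompress with Inflater; end() runs even if inflation fails.
    public static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or dictionary-dependent input");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt deflate data", e);
        } finally {
            inflater.end(); // without this, native memory waits on GC cleanup
        }
    }

    public static void main(String[] args) {
        byte[] original = "word/document.xml".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = decompress(compress(original));
        System.out.println(new String(roundTrip, StandardCharsets.UTF_8));
    }
}
```

If a code path creates an Inflater and never reaches end(), the native memory is only reclaimed once the object is collected, which can lag well behind actual usage when the Java heap itself is under little pressure. Whether that is what happens on the parseToString path is a question for the Tika lists.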