Hi team,

We are trying to read data from office documents such as xlsx, xls, and docx.
We run into memory issues when reading large OOXML files (around 100 MB) with
the POI APIs. For the xls/xlsx formats there are event-based APIs (the
XSSF/HSSF event APIs) that solve the memory problem, but for Word and
PowerPoint files there are no event-based APIs. We have to create an XWPF/HWPF
document object, which consumes a lot of memory. For example, building an
XWPFDocument from a 45 MB DOCX file takes 12 GB of heap.

So, similar to the xlsx support, is there any plan to provide event-based APIs
for the rest of the office document formats?
Also, is there any workaround to read the data with lower memory consumption?
Please let me know. Our use case is read-only: we just need to read the data.
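
To make the question concrete, here is a rough sketch of the kind of
low-memory read we have in mind for DOCX. It is not a POI API. It uses only the
JDK (ZipInputStream plus StAX) and relies on the fact that a DOCX file is a ZIP
archive whose main body is the part word/document.xml. The class name
DocxTextStreamer and the method extractText are hypothetical names made up for
illustration. The sketch assumes the transitional WordprocessingML namespace,
and it only collects paragraph text; tables, headers, footnotes, and tabs
are ignored.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Apache POI.
public class DocxTextStreamer {

    // Assumption: transitional OOXML namespace. Strict OOXML files use a different URI.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Streams the text of word/document.xml into the sink, one line per paragraph,
    // without building an in-memory document model.
    public static void extractText(InputStream docx, Appendable sink)
            throws IOException, XMLStreamException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/document.xml".equals(entry.getName())) {
                    continue; // skip styles, media, and other parts
                }
                XMLInputFactory factory = XMLInputFactory.newInstance();
                // Disable DTDs and external entities for untrusted input.
                factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
                factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
                // Report CDATA sections as ordinary character events.
                factory.setProperty(XMLInputFactory.IS_COALESCING, true);

                XMLStreamReader xml = factory.createXMLStreamReader(zip);
                boolean inText = false;
                while (xml.hasNext()) {
                    int event = xml.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        // <w:t> holds the visible text of a run
                        if (W_NS.equals(xml.getNamespaceURI())
                                && "t".equals(xml.getLocalName())) {
                            inText = true;
                        }
                    } else if (event == XMLStreamConstants.CHARACTERS && inText) {
                        sink.append(xml.getText());
                    } else if (event == XMLStreamConstants.END_ELEMENT
                            && W_NS.equals(xml.getNamespaceURI())) {
                        if ("t".equals(xml.getLocalName())) {
                            inText = false;
                        } else if ("p".equals(xml.getLocalName())) {
                            sink.append('\n'); // end of paragraph
                        }
                    }
                }
                xml.close(); // does not close the underlying zip stream
                return;
            }
        }
    }
}
```

Passing a Writer backed by a file as the sink would keep the extracted text
out of the heap as well. We would still prefer a supported POI API over
something hand-rolled like this.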

Thanks
Chaitanya
