https://issues.apache.org/bugzilla/show_bug.cgi?id=57401
Bug ID: 57401
Summary: [PATCH] POI SharedStringsTable's MapDB implementation
to reduce memory footprint
Product: POI
Version: unspecified
Hardware: PC
Status: NEW
Severity: enhancement
Priority: P2
Component: XSSF
Assignee: [email protected]
Reporter: [email protected]
Created attachment 32333
--> https://issues.apache.org/bugzilla/attachment.cgi?id=32333&action=edit
The attachment contains an additional class and enum, tweaks to existing code,
and test cases. The writing part of the test case requires less than -Xmx100M
to write data.
Problem: SXSSFWorkbook defaults to using inline strings instead of a shared
strings table. This is very efficient, since no document content needs to be
kept in memory, but it is also known to produce documents that are incompatible
with some clients, and the workbook size will be large.
With shared strings enabled in SXSSFWorkbook, all unique strings in the
document have to be kept in memory, which uses a lot more resources than with
shared strings disabled.
Solution: To reduce the memory footprint of POI’s shared strings table
implementation, we implemented a shared strings table using MapDB.
Overall, the MapDB solution is slower than pure POI, but uses much less
memory.
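To illustrate the idea, here is a minimal sketch of a shared strings table
whose storage is pluggable. It is not the code from the attached patch: the
class name SketchSharedStrings and its methods are hypothetical, and plain
java.util.Map instances stand in for the backing store. The assumption is that
a disk-backed map (for example one provided by MapDB) could be passed in
instead of a HashMap, so that unique strings need not all stay on the heap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the class from the attached patch.
class SketchSharedStrings {
    // Both maps are supplied by the caller, so they can be in-memory
    // (HashMap) or, by assumption, disk-backed Map implementations.
    private final Map<String, Integer> indexByString;
    private final Map<Integer, String> stringByIndex;
    private int count; // total references, including repeated strings

    SketchSharedStrings(Map<String, Integer> indexByString,
                        Map<Integer, String> stringByIndex) {
        this.indexByString = indexByString;
        this.stringByIndex = stringByIndex;
    }

    // Returns the index of the string, storing it only on first sight.
    int addEntry(String s) {
        count++;
        Integer existing = indexByString.get(s);
        if (existing != null) {
            return existing;
        }
        int index = indexByString.size();
        indexByString.put(s, index);
        stringByIndex.put(index, s);
        return index;
    }

    String getEntryAt(int index) {
        return stringByIndex.get(index);
    }

    int getCount() {
        return count;
    }

    int getUniqueCount() {
        return indexByString.size();
    }
}
```

With HashMap the sketch behaves like an ordinary in-memory table. Swapping in
a disk-backed map would trade lookup speed for a smaller heap, which matches
the slower-but-smaller trade-off described above.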
Attached patch
So far we have not found a clean way to achieve this without patching POI code.
To achieve this, we have added a SharedStringsTable type (Default or MapDB) that
is chosen through the XSSFWorkbook constructor, and we have overridden the
write(OutputStream stream) method from POIXMLDocument (removing the final
keyword from this method so that it can be overridden).
The DBMappedSharedStringsTable class extends SharedStringsTable and contains
the logic to move data to disk depending on the memory available.
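The shape of this change can be sketched as follows. This is only an
illustration of the pattern described above, not the patch itself: every name
beginning with Sketch is hypothetical, and the classes are simplified
stand-ins rather than the real POI classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the table type enum (Default or MapDB).
enum SketchTableType { DEFAULT, DISK_BACKED }

// Stand-in for the default, in-memory shared strings table.
class SketchStringsTable {
    String storageName() { return "memory"; }
}

// Stand-in for a subclass that would keep its data on disk.
class SketchDiskStringsTable extends SketchStringsTable {
    @Override
    String storageName() { return "disk"; }
}

// Stand-in for the document base class; write() is deliberately not final.
class SketchDocument {
    void write(OutputStream out) throws IOException {
        out.write("document".getBytes(StandardCharsets.UTF_8));
    }
}

// Stand-in for the workbook: picks the table type in its constructor and
// overrides write() so the table is handled before the rest is written.
class SketchWorkbook extends SketchDocument {
    private final SketchStringsTable table;

    SketchWorkbook(SketchTableType type) {
        this.table = (type == SketchTableType.DISK_BACKED)
                ? new SketchDiskStringsTable()
                : new SketchStringsTable();
    }

    @Override
    void write(OutputStream out) throws IOException {
        String header = "strings:" + table.storageName() + ";";
        out.write(header.getBytes(StandardCharsets.UTF_8));
        super.write(out);
    }

    // Convenience helper for the example: writes into a String.
    static String writeToString(SketchDocument doc) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            doc.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Choosing the table through a constructor argument keeps existing callers on
the default behaviour, while the override is only possible because the base
method is no longer final.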