Hi, does anyone have time to review the patch submitted in https://issues.apache.org/bugzilla/show_bug.cgi?id=57401?

On Tue, Dec 30, 2014 at 2:57 PM, Sumedh <[email protected]> wrote:

> Hi guys,
>
> We've submitted a patch to have an option to use MapDB based shared string
> table instead of fully in-memory one.
> https://issues.apache.org/bugzilla/show_bug.cgi?id=57401
>
> There will be a new constructor to use this approach...
>
> public SXSSFWorkbook(XSSFWorkbook workbook, int rowAccessWindowSize,
> boolean compressTmpFiles, *SharedStringsTableType sharedStringsTableType*)
>
> In our tests, it's performing pretty well in terms of memory footprint...
>
> Please take a look and provide feedback...
>
> On Tue, Dec 16, 2014 at 3:01 PM, Nick Burch <[email protected]> wrote:
>
>> On Tue, 16 Dec 2014, Sumedh wrote:
>>
>>> 1. For a quick win, is it possible to provide a hook so that we can plug
>>> in an overridden implementation of SharedStringTable class? As far as I
>>> saw, there is no clean pluggability available right now (but I have very
>>> little understanding of POI codebase).
>>
>> We'd need to tweak things to allow that. However, is working at the CTRst
>> level going to be good for you with MapDB or similar? Will serialising then
>> deserialising those cause you lots of problems / overhead? Would there be a
>> better "thing" to pass back and forth between XSSF / SXSSF / SAX code for a
>> shared string?
>>
>> (There has been discussion lately about trying to avoid the amount of
>> xmlbeans objects on public interfaces, so that a switch to something like
>> jaxp could be done later if we want to, so this is one case when we can
>> consider it)
>>
>>> 2. If that works well, we can explore using MapDB as one of the options
>>> to be used natively after considering all the other factors (like licensing
>>> and size)...or may be some other smaller library focused only on this
>>> aspect, or Alex's homegrown code. :)
>>>
>>> BTW, MapDB is free as speech and free as beer under Apache License 2.0
>>> <https://github.com/jankotek/MapDB/blob/master/doc/license.txt>. :)
>>> - https://github.com/jankotek/MapDB/blob/master/license.txt
>>
>> And small too, so I don't see any major issues with making it an option
>> for people wanting lower memory but higher IO reading, assuming we can't
>> find a better one (eg from Alex or Lucene!)
>>
>> Nick
>
> --
> Cheers,
> Sumedh
> http://www.linkedin.com/in/sumedhinamdar
> Ph: +91 - 95610 99125

--
Cheers,
Sumedh
http://www.linkedin.com/in/sumedhinamdar
Ph: +91 - 95610 99125
