Good point, thanks! After some initial profiling, the vast majority of the time
was spent on two things: 1) parsing the Excel file (parseSheet in XSSFWorkbook)
and 2) running finalisers (java.lang.ref.Finalizer).

I created a cache that pools Workbook objects to avoid (1) above, which also
helped reduce (2). I will refine this a bit, and then post results for
discussion!
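
For illustration only, a minimal sketch of what such a pool could look like.
This is not the actual cache, just the general shape under some assumptions:
the names WorkbookPool, borrow and release are hypothetical, and the type
parameter W stands in for a POI Workbook so that the sketch compiles without
POI on the classpath. The reset callback is where the changed input cells
would be restored before the workbook is handed to the next request.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical fixed-size pool. W stands in for an object that is
 * expensive to parse, such as a POI Workbook (assumption: kept generic
 * so the sketch has no POI dependency).
 */
final class WorkbookPool<W> {
    private final BlockingQueue<W> idle;
    private final Consumer<W> reset;

    /**
     * @param size   number of instances to parse up front
     * @param loader does the expensive parsing, called exactly size times
     * @param reset  restores an instance to its freshly loaded state
     */
    WorkbookPool(int size, Supplier<W> loader, Consumer<W> reset) {
        this.idle = new ArrayBlockingQueue<>(size);
        this.reset = reset;
        for (int i = 0; i < size; i++) {
            idle.add(loader.get());
        }
    }

    /** Blocks until an instance is free, so each one has a single user at a time. */
    W borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a workbook", e);
        }
    }

    /** Resets the instance and puts it back; throws if more are returned than were borrowed. */
    void release(W workbook) {
        reset.accept(workbook);
        idle.add(workbook);
    }
}
```

With POI, the loader would wrap whatever call parses the file (for example
constructing an XSSFWorkbook), and a request would call borrow(), update and
evaluate its cells, then call release() in a finally block. Parsing then
happens only when the pool is created, not per request.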


Markus


> On 06 Jul 2016, at 13:59, Dominik Stadler <dominik.stad...@gmx.at> wrote:
> 
> Hi,
> 
> I would strongly suggest doing some profiling of your particular use-case
> first. Just starting some optimization might lead you in the wrong
> direction altogether. Only when you know where, and how much, time is spent
> can you make an educated guess at where some change will actually provide an
> improvement.
> 
> Based on the results, we may be able to suggest things to make it run
> quicker or even be able to fix POI to do things in a better way altogether.
> 
> Dominik.
> 
> On Tue, Jul 5, 2016 at 7:21 PM, Blake Watson <blake.wat...@pnmac.com> wrote:
> 
>> 1) Ensure I only save relevant sheets/cells from the files (to speed up
>> retrieval/parsing)
>> 2) Override parsing in XSSFWorkbook to avoid unnecessary work such as
>> themes, styles etc
>> 3) Pool the workbooks to avoid creating them every time (even though I need
>> to be able to update them separately for every request)
>> 4) Something else :-)
>> 
>> I don't know about #1. Personally, I just save the output values, not the
>> sheets.
>> 
>> #2 seems like a whole lot of work, especially compared to #3.
>> 
>> #3 is what I do: Load the workbook on first request and re-use for
>> subsequent requests. If you know in advance what cells may be changed, you
>> can cache those values when you first load a workbook. Then you hand the
>> workbook out to a consumer, with the understanding that only those input
>> values are changed. When done you restore the inputs to their
>> original values. (And, in my case, I re-evaluate the workbook at that
>> point, to restore things as much as possible.)
>> 
>> #4? Well, profiling is good. A lot of times you think your bottleneck is
>> one thing and it turns out to be something unexpected entirely.
>> 
>> ===Blake===
>> 

