Good point, thanks! After some initial profiling, it turned out that the vast majority of the time was spent on two things: 1) parsing the Excel file (parseSheet in XSSFWorkbook) and 2) running finalisers (java.lang.ref.Finalizer).
I did create a cache that pools Workbook objects to avoid (1) above, which then also helped reduce (2). I will refine this a bit, and then post results for discussion!

Markus

> On 06 Jul 2016, at 13:59, Dominik Stadler <dominik.stad...@gmx.at> wrote:
>
> Hi,
>
> I would strongly suggest to do some profiling of your particular use-case
> first. Just starting some optimization might lead you in the wrong
> direction altogether. Only when you know where how much time is spent you
> can make an educated guess at where some change will actually provide an
> improvement.
>
> Based on the results, we may be able to suggest things to make it run
> quicker or even be able to fix POI to do things in a better way altogether.
>
> Dominik.
>
> On Tue, Jul 5, 2016 at 7:21 PM, Blake Watson <blake.wat...@pnmac.com> wrote:
>
>> 1) Ensure I only save relevant sheets/cells from the files (to speed up
>> retrieval/parsing)
>> 2) Override parsing in XSSFWorkbook to avoid unnecessary work such as
>> themes, styles etc
>> 3) Pool the workbooks to avoid creating them every time (even though I need
>> to be able to update them separately for every request)
>> 4) Something else :-)
>>
>> I don't know about #1. Personally, I just save the output values, not the
>> sheets.
>>
>> #2 seems like a whole lot of work, especially compared to #3.
>>
>> #3 is what I do: Load the workbook on first request and re-use for
>> subsequent requests. If you know in advance what cells may be changed, you
>> can cache those values when you first load a workbook. Then you hand the
>> workbook out to a consumer, with the understanding that only those input
>> values are changed. When done you restore the inputs to their
>> original values. (And, in my case, I re-evaluate the workbook at that
>> point, to restore things as much as possible.)
>>
>> #4? Well, profiling is good. A lot of times you think your bottleneck is
>> one thing and it turns out to be something unexpected entirely.
>>
>> ===Blake===
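
For readers following the thread, here is a minimal sketch of the pooling idea from #3: load on first request, hand the object out, restore it when it comes back. It is not the code anyone in this thread actually used. The class name ParsedPool, its methods, and the loader/reset callbacks are all hypothetical; with POI the loader would presumably wrap something like WorkbookFactory.create(file), and the reset callback would write the cached input values back into the cells.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;
import java.util.function.Function;

/**
 * Hypothetical pool of expensively parsed objects (e.g. workbooks), keyed by
 * whatever identifies the source file. Names are illustrative only.
 */
final class ParsedPool<K, V> {
    private final Function<K, V> loader;  // expensive parse, run only when no idle copy exists
    private final Consumer<V> reset;      // restores the original input values
    private final Map<K, Queue<V>> idle = new ConcurrentHashMap<>();

    ParsedPool(Function<K, V> loader, Consumer<V> reset) {
        this.loader = loader;
        this.reset = reset;
    }

    /** Borrow an idle instance for this key, or load a fresh one if none is idle. */
    V borrow(K key) {
        Queue<V> queue = idle.get(key);
        V value = (queue == null) ? null : queue.poll();
        return (value != null) ? value : loader.apply(key);
    }

    /** Restore the instance to its original state and make it available again. */
    void giveBack(K key, V value) {
        reset.accept(value);
        idle.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(value);
    }
}
```

Each request borrows its own instance, so concurrent requests never share a workbook; the cost is one parsed copy per concurrent request rather than one parse per request. Reusing instances also means fewer short-lived objects waiting on finalisation, which would be consistent with the drop in (2) reported above.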