I would also like to mention this sample code:
https://github.com/pjfanning/excel-streaming-reader/blob/main/src/test/java/com/github/pjfanning/xlsx/CopyToSXSSFUtil.java

It uses CellUtil directly instead of the copyRows methods. Using
CellUtil directly like this might reduce your need for POI changes.
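
A minimal sketch of that idea, not the linked utility itself: it walks
the source sheet row by row and copies each cell with
CellUtil.copyCell (available since POI 5.2.0), so no copyRows or
copyRowFrom support is needed on the SXSSF side. The class name
StreamingSheetCopy, the method copySheet and the window size of 100
are assumptions for illustration only. Whether every getter used by
copyCell is supported by a given streaming reader needs to be checked
by testing.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellCopyContext;
import org.apache.poi.ss.usermodel.CellCopyPolicy;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.util.CellUtil;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper, named for illustration only.
public final class StreamingSheetCopy {
    private StreamingSheetCopy() {}

    public static void copySheet(Sheet source, String newSheetName, OutputStream out)
            throws IOException {
        // Default policy copies values, formulas, styles and hyperlinks.
        CellCopyPolicy policy = new CellCopyPolicy();
        // The context remembers source-to-destination style mappings, so a
        // style is cloned into the new workbook once rather than per cell.
        CellCopyContext context = new CellCopyContext();

        // Keep at most 100 rows in memory (assumed value); older rows are
        // flushed to a temporary file.
        try (SXSSFWorkbook dest = new SXSSFWorkbook(100)) {
            Sheet target = dest.createSheet(newSheetName);
            // Sheet and Row are Iterable, so the source is never held in a List.
            for (Row srcRow : source) {
                Row destRow = target.createRow(srcRow.getRowNum());
                for (Cell srcCell : srcRow) {
                    Cell destCell = destRow.createCell(srcCell.getColumnIndex());
                    CellUtil.copyCell(srcCell, destCell, policy, context);
                }
            }
            dest.write(out);
        }
    }
}
```

SXSSF expects rows to be created in increasing row order, which
iterating the source sheet provides. On POI versions where close()
does not remove the temporary files, dispose() may also be needed.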

On Wed, 2 Jul 2025 at 18:22, PJ Fanning <fannin...@apache.org> wrote:
>
> There are too many applications that read Excel format files to know
> how they all behave. (Answer to the side question - on your PR, I
> suggested allowing SXSSF as an option and not forcing all users to use
> SXSSF)
>
> I don't want to get too involved in this. Trial and error is the only
> way forward with POI related development.
>
> POI is released infrequently so any PRs may take quite some time to be
> in a released jar.
>
> If you submit PRs to POI or to excel-streaming-reader, I'll have a
> look. Other POI contributors may also have a look. I'm busy on other
> projects, so am not super enthusiastic about spending much time on
> this. PRs will need test cases and must not break existing APIs (no
> matter how annoying they are - they can be deprecated but not broken
> or removed).
>
> On Wed, 2 Jul 2025 at 15:52, Piotr Zalas <pza...@onet.pl.invalid> wrote:
> >
> > Hello Devs,
> >
> > I'm implementing a change in Apache NiFi that reduces memory usage when
> > copying an Excel sheet. We use com.github.pjfanning/excel-streaming-reader
> > for reading Excel files, and Apache POI for writing the output file. In the PR
> > (https://github.com/apache/nifi/pull/10058/files) I got a suggestion to
> > include some of the code in the POI project:
> > 1. To add an SXSSFRow#copyRowFrom(Row srcRow, CellCopyPolicy policy,
> > CellCopyContext context) method, similar to the method available in XSSFRow.
> > In addition, classes similar to XSSFRowShifter and XSSFRowColShifter, which
> > are used by the above method, would need to be implemented for SXSSFSheet.
> > A non-trivial part would be to implement
> > XSSFRowColShifter#updateRowFormulas, because it uses CTCell, which isn't
> > available in SXSSFCell. I would be grateful for some implementation tips
> > regarding this method, in particular how to substitute one object for the
> > other in the implementation.
> > 2. To add a memory-efficient method similar to XSSFSheet#copyRows(List<?
> > extends Row> srcRows, int destStartRow, CellCopyPolicy policy) to the
> > SXSSFSheet class. Instead of using a list of input rows, I'm thinking of
> > using a Sheet or a row iterator to avoid storing all rows in memory. The
> > tricky part is that I need to use StreamingSheet from
> > excel-streaming-reader for memory efficiency, which doesn't implement many
> > of the Sheet interface methods, and I need to ensure compatibility with such
> > a reader. Perhaps a method cloneSheet(String newSheetName, Sheet sourceSheet)
> > in SXSSFWorkbook would make sense?
> >
> > Are you ok with implementing some of the above changes in POI? If yes, let 
> > me know if there are some adjustments needed to the proposed API contract.
> >
> > As a side question, the SXSSFWorkbook javadoc mentions that use of shared
> > strings is disabled by default and that this might break some clients trying
> > to read the saved file. Do you have examples of affected clients (e.g. MS
> > Excel, Apple Numbers, Google Sheets import, some widely used library)?
> > I'm trying to understand whether migrating away from XSSFWorkbook could
> > break things for some NiFi users.
> >
> > Best,
> > Piotr

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
For additional commands, e-mail: dev-h...@poi.apache.org
