I would also like to mention this sample code: https://github.com/pjfanning/excel-streaming-reader/blob/main/src/test/java/com/github/pjfanning/xlsx/CopyToSXSSFUtil.java
It uses CellUtil directly instead of using the copyRows methods. Using
CellUtil directly like this might reduce your need for POI changes.

On Wed, 2 Jul 2025 at 18:22, PJ Fanning <fannin...@apache.org> wrote:
>
> There are too many applications that read Excel format files to know
> how they all behave. (Answer to the side question - on your PR, I
> suggested allowing SXSSF as an option and not forcing all users to use
> SXSSF)
>
> I don't want to get too involved in this. Trial and error is the only
> way forward with POI related development.
>
> POI is released infrequently so any PRs may take quite some time to be
> in a released jar.
>
> If you submit PRs to POI or to excel-streaming-reader, I'll have a
> look. Other POI contributors may also have a look. I'm busy on other
> projects, so am not super enthusiastic about spending much time on
> this. PRs will need test cases and not to break existing APIs (no
> matter how annoying they are - they can be deprecated but not broken
> or removed).
>
> On Wed, 2 Jul 2025 at 15:52, Piotr Zalas <pza...@onet.pl.invalid> wrote:
> >
> > Hello Devs,
> >
> > I'm implementing a change in Apache NiFi that optimises memory usage
> > when copying an Excel sheet. We use
> > com.github.pjfanning/excel-streaming-reader for reading Excel files,
> > and Apache POI for writing the output file. In the PR
> > (https://github.com/apache/nifi/pull/10058/files) I got a suggestion
> > to include some of the code in the POI project:
> >
> > 1. To add an SXSSFRow#copyRowFrom(Row srcRow, CellCopyPolicy policy,
> > CellCopyContext context) method, similar to the method available in
> > XSSFRow. In addition, classes similar to XSSFRowShifter and
> > XSSFRowColShifter, which are used by the above method, would need to
> > be implemented for SXSSFSheet. A non-trivial part would be to
> > implement XSSFRowColShifter#updateRowFormulas, because it uses CTCell,
> > which isn't available in SXSSFCell. I would be grateful for some
> > implementation tips regarding this method, and how to substitute one
> > object for the other in the implementation.
> >
> > 2. To add a memory-efficient method similar to
> > XSSFSheet#copyRows(List<? extends Row> srcRows, int destStartRow,
> > CellCopyPolicy policy) to the SXSSFSheet class. Instead of using a
> > list of input rows, I'm thinking of using a Sheet or a row iterator to
> > avoid storing all rows in memory. The tricky part is that I need to
> > use StreamingSheet from excel-streaming-reader for memory efficiency,
> > which doesn't implement many of the Sheet interface methods, and I
> > need to ensure compatibility with such a reader. Perhaps a method
> > cloneSheet(String newSheetName, Sheet sourceSheet) in SXSSFWorkbook
> > would make sense?
> >
> > Are you ok with implementing some of the above changes in POI? If yes,
> > let me know if there are some adjustments needed to the proposed API
> > contract.
> >
> > As a side question, the SXSSFWorkbook javadoc mentions that by default
> > the use of shared strings is disabled and that this might break some
> > clients trying to read the saved file. Do you have examples of
> > affected clients (e.g. MS Excel, Apple Numbers, Google Sheets import,
> > some widely used library)? I'm trying to understand whether migrating
> > away from XSSFWorkbook could break some NiFi users.
> >
> > Best,
> > Piotr
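
To make point 2 of the quoted message a bit more concrete, here is a minimal, library-free sketch of copying rows from an iterator into a writer that keeps only a bounded window of rows in memory. It does not use POI or excel-streaming-reader at all: `StreamingRowCopySketch`, `WindowedSink` and `copyRows` are hypothetical names made up for illustration, rows are modelled as plain lists of strings, and the "flush" step only counts rows where a real writer would serialise them to disk. It is meant to show the shape of the idea, not a proposed API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class StreamingRowCopySketch {

    /** Hypothetical destination that holds at most windowSize rows in memory. */
    static final class WindowedSink {
        private final int windowSize;
        private final Deque<List<String>> window = new ArrayDeque<>();
        private int flushedRows = 0;
        private int maxRowsHeld = 0;

        WindowedSink(int windowSize) {
            if (windowSize < 1) {
                throw new IllegalArgumentException("windowSize must be >= 1");
            }
            this.windowSize = windowSize;
        }

        /** Adds a row; if the window is already full, the oldest row is flushed first. */
        void addRow(List<String> cells) {
            if (window.size() == windowSize) {
                flushOldest();
            }
            window.addLast(new ArrayList<>(cells));
            maxRowsHeld = Math.max(maxRowsHeld, window.size());
        }

        /** Flushes whatever is still held in the window. */
        void close() {
            while (!window.isEmpty()) {
                flushOldest();
            }
        }

        private void flushOldest() {
            window.removeFirst(); // a real writer would serialise the row to disk here
            flushedRows++;
        }

        int flushedRowCount() { return flushedRows; }

        int maxRowsHeld() { return maxRowsHeld; }
    }

    /** Copies rows one at a time from the iterator; the source is never collected into a list. */
    static int copyRows(Iterator<List<String>> source, WindowedSink dest) {
        int copied = 0;
        while (source.hasNext()) {
            dest.addRow(source.next());
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        final int totalRows = 1000;
        // Rows are generated lazily, standing in for a streaming reader.
        Iterator<List<String>> rows = new Iterator<List<String>>() {
            private int next = 0;

            @Override
            public boolean hasNext() { return next < totalRows; }

            @Override
            public List<String> next() {
                if (next >= totalRows) {
                    throw new NoSuchElementException();
                }
                int r = next++;
                return List.of("r" + r + "c0", "r" + r + "c1");
            }
        };

        WindowedSink sink = new WindowedSink(100);
        int copied = copyRows(rows, sink);
        sink.close();
        System.out.println("copied=" + copied
                + " flushed=" + sink.flushedRowCount()
                + " maxHeld=" + sink.maxRowsHeld());
    }
}
```

The point of the sketch is only that an iterator-based signature lets the number of rows held at any time depend on the window size rather than on the size of the source sheet; formulas, styles and merged regions, which are the hard parts raised in point 1, are not modelled here.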