Hello Devs,

I'm implementing a change in Apache NiFi that reduces the memory used when copying an Excel sheet. We use com.github.pjfanning/excel-streaming-reader to read Excel files and Apache POI to write the output file. In the PR (https://github.com/apache/nifi/pull/10058/files) it was suggested that I contribute some of the code to the POI project:

1. Add an SXSSFRow#copyRowFrom(Row srcRow, CellCopyPolicy policy, CellCopyContext context) method, similar to the one available in XSSFRow. In addition, classes similar to XSSFRowShifter and XSSFRowColShifter, which that method uses, would need to be implemented for SXSSFSheet. The non-trivial part would be XSSFRowColShifter#updateRowFormulas, because it uses CTCell, which isn't available in SXSSFCell. I would be grateful for implementation tips on this method, in particular on how to substitute another object for CTCell.

2. Add a memory-efficient method to SXSSFSheet, similar to XSSFSheet#copyRows(List<? extends Row> srcRows, int destStartRow, CellCopyPolicy policy). Instead of a list of input rows, I'm thinking of taking a Sheet or a row iterator, to avoid holding all rows in memory. The tricky part is that, for memory efficiency, I need to use StreamingSheet from excel-streaming-reader, which doesn't implement many of the Sheet interface methods, so the new method has to stay compatible with such a reader. Perhaps a method cloneSheet(String newSheetName, Sheet sourceSheet) in SXSSFWorkbook would make sense?
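To make the iterator-based idea in point 2 more concrete, here is a rough, dependency-free sketch of the shape I have in mind. It is only an illustration: SrcRow, WindowedSheet and this copyRows signature are hypothetical stand-ins, not POI or excel-streaming-reader classes, and cell styles, formulas and merged regions are ignored. The destination keeps a bounded window of rows in memory, as an assumption about how an SXSSF-style writer would behave, and destination row numbers keep the source offsets relative to the first copied row.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class StreamingCopySketch {

    /** Hypothetical stand-in for a source row (not a POI class). */
    static final class SrcRow {
        final int rowNum;
        final List<Object> cells;

        SrcRow(int rowNum, List<Object> cells) {
            this.rowNum = rowNum;
            this.cells = cells;
        }
    }

    /** Hypothetical stand-in for a destination sheet that holds only a window of rows. */
    static final class WindowedSheet {
        private final int windowSize;
        private final Deque<SrcRow> window = new ArrayDeque<>();
        final List<SrcRow> flushed = new ArrayList<>(); // stands in for rows already written out
        int maxRowsHeld = 0;

        WindowedSheet(int windowSize) {
            this.windowSize = windowSize;
        }

        void appendRow(SrcRow row) {
            window.addLast(row);
            if (window.size() > windowSize) {
                flushed.add(window.removeFirst()); // evict the oldest row once the window is full
            }
            maxRowsHeld = Math.max(maxRowsHeld, window.size());
        }

        /** Flushes whatever is still in the window. */
        void close() {
            while (!window.isEmpty()) {
                flushed.add(window.removeFirst());
            }
        }
    }

    /** Copies rows one at a time from an iterator, so the source is never materialised as a list. */
    static int copyRows(Iterator<SrcRow> srcRows, int destStartRow, WindowedSheet dest) {
        int copied = 0;
        int firstSrcRowNum = -1;
        while (srcRows.hasNext()) {
            SrcRow src = srcRows.next();
            if (copied == 0) {
                firstSrcRowNum = src.rowNum;
            }
            // Preserve gaps between source rows relative to the first copied row.
            int destRowNum = destStartRow + (src.rowNum - firstSrcRowNum);
            dest.appendRow(new SrcRow(destRowNum, new ArrayList<>(src.cells)));
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        List<SrcRow> source = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            source.add(new SrcRow(i, Arrays.<Object>asList("name" + i, i)));
        }
        WindowedSheet dest = new WindowedSheet(3);
        int copied = copyRows(source.iterator(), 100, dest);
        dest.close();
        System.out.println(copied + " rows copied, at most " + dest.maxRowsHeld + " held in memory");
    }
}
```

In a real implementation the iterator would come from the streaming reader's sheet and the destination would be an SXSSFSheet. The point of the sketch is only that the method never needs random access to the source rows.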
Are you OK with implementing some of the above changes in POI? If so, let me know whether the proposed API contract needs any adjustments.

As a side question, the SXSSFWorkbook javadoc mentions that shared strings are disabled by default and that this might break some clients trying to read the saved file. Do you have examples of affected clients (e.g. MS Excel, Apple Numbers, Google Sheets import, some widely used library)? I'm trying to understand whether migrating away from XSSFWorkbook could break things for NiFi users.

Best,
Piotr