Hello Devs,

I'm implementing a change in Apache NiFi that reduces memory usage when copying
Excel sheets. We use com.github.pjfanning/excel-streaming-reader to read Excel
files and Apache POI to write the output file. In the PR
(https://github.com/apache/nifi/pull/10058/files) it was suggested that I
contribute some of the code to the POI project:
1. Add an SXSSFRow#copyRowFrom(Row srcRow, CellCopyPolicy policy,
CellCopyContext context) method, similar to the one available in XSSFRow. In
addition, classes similar to XSSFRowShifter and XSSFRowColShifter, which that
method uses, would need to be implemented for SXSSFSheet. A non-trivial part
would be implementing XSSFRowColShifter#updateRowFormulas, because it uses
CTCell, which isn't available in SXSSFCell. I would be grateful for
implementation tips on this method, in particular on what could stand in for
CTCell in the SXSSF implementation.
2. Add a memory-efficient method to SXSSFSheet, similar to
XSSFSheet#copyRows(List<? extends Row> srcRows, int destStartRow,
CellCopyPolicy policy). Instead of a list of input rows, I'm thinking of
accepting a Sheet or a row iterator, to avoid holding all rows in memory. The
tricky part is that, for memory efficiency, I need to use StreamingSheet from
excel-streaming-reader, which doesn't implement many of the Sheet interface
methods, so I need to ensure compatibility with that reader. Perhaps a
cloneSheet(String newSheetName, Sheet sourceSheet) method in SXSSFWorkbook
would make sense?
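
To make the idea in point 2 more concrete, here is a rough sketch of the
contract I have in mind: rows are pulled from an iterator one at a time and
handed straight to the writer, so no list of rows is ever built. It
deliberately uses plain Java stand-ins rather than POI types: SrcRow, RowSink
and this copyRows signature are hypothetical names for illustration only, and
the sketch ignores styles, formulas and merged regions.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StreamingRowCopySketch {

    /** Hypothetical stand-in for a source row (NOT POI's Row interface). */
    static final class SrcRow {
        final int rowNum;
        final List<Object> cellValues;

        SrcRow(int rowNum, List<Object> cellValues) {
            this.rowNum = rowNum;
            this.cellValues = cellValues;
        }
    }

    /** Hypothetical stand-in for the streaming writer side (the role SXSSFSheet would play). */
    interface RowSink {
        void writeRow(int rowNum, List<Object> cellValues);
    }

    /**
     * Copies rows one at a time from the iterator to the sink.
     * Only the current row is referenced at any moment; gaps between
     * source row numbers are preserved relative to destStartRow.
     */
    static int copyRows(Iterator<SrcRow> srcRows, int destStartRow, RowSink sink) {
        int copied = 0;
        Integer firstSrcRowNum = null; // row number of the first row seen
        while (srcRows.hasNext()) {
            SrcRow row = srcRows.next();
            if (firstSrcRowNum == null) {
                firstSrcRowNum = row.rowNum;
            }
            int destRowNum = destStartRow + (row.rowNum - firstSrcRowNum);
            sink.writeRow(destRowNum, row.cellValues);
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        List<SrcRow> source = Arrays.asList(
                new SrcRow(0, Arrays.<Object>asList("a", 1.0)),
                new SrcRow(2, Arrays.<Object>asList("b", 2.0)));
        // The map only records what was written, to make the demo observable.
        Map<Integer, List<Object>> written = new TreeMap<>();
        int copied = copyRows(source.iterator(), 10, written::put);
        System.out.println(copied + " rows copied to rows " + written.keySet());
    }
}
```

In a real implementation the iterator would come from the source Sheet (which
is iterable over its rows) and the sink would create rows and cells on the
SXSSFSheet, but the single-pass, forward-only shape is the part that matters
for compatibility with a streaming reader.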

Are you OK with implementing some of the above changes in POI? If so, please
let me know whether the proposed API contract needs any adjustments.

As a side question, the SXSSFWorkbook javadoc mentions that shared strings are
disabled by default and that this might break some clients trying to read the
saved file. Do you have examples of affected clients (e.g. MS Excel, Apple
Numbers, Google Sheets import, some widely used library)? I'm trying to
understand whether migrating away from XSSFWorkbook could break anything for
NiFi users.

Best,
Piotr
