Greetings.

Please use the Excel Streaming Reader when reading large
files: https://github.com/monitorjbl/excel-streaming-reader

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.ss.usermodel.Workbook;

import com.monitorjbl.xlsx.StreamingReader;

InputStream is = new FileInputStream(new File("/path/to/workbook.xlsx"));
Workbook workbook = StreamingReader.builder()
        .rowCacheSize(100)    // number of rows to keep in memory (defaults to 10)
        .bufferSize(4096)     // buffer size to use when reading InputStream to file (defaults to 1024)
        .open(is);            // InputStream or File for XLSX file (required)



With the code above you can loop through your rows and write them to CSV.
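
For the CSV side, here is a rough sketch in plain Java with no POI
dependency, so it can be tried on its own. The class name CsvRows and its
methods are made up for illustration and are not part of POI or the
streaming reader. It assumes each row has already been collected into a
List<String> of cell values, and it quotes fields RFC 4180 style.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns rows of cell values into CSV text.
public class CsvRows {

    // Quote a field only if it contains a comma, a quote, or a line break;
    // embedded quotes are doubled.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Join the cells of one row into a single CSV line (no line terminator).
    static String toCsvLine(List<String> cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escape(cells.get(i)));
        }
        return sb.toString();
    }

    // Write rows one at a time, so only the current row is held in memory.
    static void writeRows(Iterable<List<String>> rows, Writer out) throws IOException {
        for (List<String> row : rows) {
            out.write(toCsvLine(row));
            out.write("\n");
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        writeRows(Arrays.asList(
                Arrays.asList("id", "name"),
                Arrays.asList("1", "Smith, John")), out);
        System.out.print(out);
    }
}
```

To feed it from the workbook, the streaming reader's README iterates with
for (Sheet sheet : workbook), for (Row r : sheet), for (Cell c : r) and reads
each value with c.getStringCellValue(). Collect those values into a list per
row and pass them to toCsvLine, writing to a FileWriter instead of the
StringWriter used in the demo.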
Best regards
Andreas


On Mon, 2021-05-03 at 05:31 -0500, Oscar Bastidas wrote:
> Hello,
> 
> I am trying to read a large Excel spreadsheet (60,000 rows) but I get what
> appears to be a memory leak error from the JVM when I use the *XSSFWorkbook*
> API.  I learned recently that there are size limitations on Excel files
> being read in this way and the Apache POI website specifically recommends
> reading the file in a streaming fashion instead of taking the whole file in
> memory.  To do this, POI recommends using something called *XLSX2CSV* but
> the provided link to teach how to use this returns a "page not found error."
> 
> Would someone please point me in the direction of how to handle reading my
> big Excel file?
> 
> The Apache POI URL that contains the link to *XLSX2CSV* is:
> 
> http://poi.apache.org/components/spreadsheet/limitations.html
> 
> Thanks for any help anyone can provide.
> 
> Oscar
> 
> Oscar Bastidas
> Research Associate
> University of Minnesota
