Greetings.
Please use the Excel Streaming Reader when reading large
files: https://github.com/monitorjbl/excel-streaming-reader
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import org.apache.poi.ss.usermodel.Workbook;
    import com.monitorjbl.xlsx.StreamingReader;

    InputStream is = new FileInputStream(new File("/path/to/workbook.xlsx"));
    Workbook workbook = StreamingReader.builder()
        .rowCacheSize(100)  // number of rows to keep in memory (defaults to 10)
        .bufferSize(4096)   // buffer size to use when reading InputStream to file (defaults to 1024)
        .open(is);          // InputStream or File for XLSX file (required)
With the code above you can loop through the rows and write them to CSV.
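For the CSV side, here is a minimal sketch of how one row could be turned into a CSV line. CsvLines, escape and toCsvLine are hypothetical names of my own, not part of excel-streaming-reader or POI, and the quoting follows the usual RFC 4180 style, which I am assuming is what you need. The loop in the comment follows the pattern from the project's README.

```java
import java.util.List;

// Hypothetical helper (not part of excel-streaming-reader or POI):
// turns the cell values of one row into a single CSV line.
//
// Intended use with the streaming workbook, roughly:
//   for (Sheet sheet : workbook) {
//       for (Row r : sheet) {
//           List<String> cells = new ArrayList<>();
//           for (Cell c : r) cells.add(c.getStringCellValue());
//           writer.write(CsvLines.toCsvLine(cells));  // writer: your BufferedWriter
//           writer.newLine();
//       }
//   }
final class CsvLines {
    // Quote a field if it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled.
    static String escape(String field) {
        if (field == null) return "";
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Join the escaped fields with commas; no trailing line break is added.
    static String toCsvLine(List<String> cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(escape(cells.get(i)));
        }
        return sb.toString();
    }
}
```

Because only one row is held as text at a time, memory use should stay flat regardless of the number of rows. One caveat, as far as I know: the cell iterator may skip empty cells, so if your sheet has gaps you may want to place values by column index instead.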
Best regards
Andreas
On Mon, 2021-05-03 at 05:31 -0500, Oscar Bastidas wrote:
> Hello,
>
> I am trying to read a large Excel spreadsheet (60,000 rows) but I get
> what appears to be a memory leak error from the JVM when I use the
> *XSSFWorkbook* API. I learned recently that there are size limitations
> on Excel files being read in this way and the Apache POI website
> specifically recommends reading the file in a streaming fashion instead
> of taking the whole file in memory. To do this, POI recommends using
> something called *XLSX2CSV* but the provided link to teach how to use
> this returns a "page not found error."
>
> Would someone please point me in the direction of how to handle
> reading my big Excel file?
>
> The Apache POI URL that contains the link to *XLSX2CSV* is:
>
> http://poi.apache.org/components/spreadsheet/limitations.html
>
> Thanks for any help anyone can provide.
>
> Oscar
>
> Oscar Bastidas
> Research Associate
> University of Minnesota