Awesome.  Thanks, I'll give this a try.

Oscar

Oscar Bastidas
Research Associate
University of Minnesota

On Mon, May 3, 2021, 6:04 AM Andreas Reichel <[email protected]>
wrote:

> Greetings.
>
> Please use the Excel Streaming Reader when reading large
> files: https://github.com/monitorjbl/excel-streaming-reader
>
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.InputStream;
>
> import org.apache.poi.ss.usermodel.Workbook;
>
> import com.monitorjbl.xlsx.StreamingReader;
>
> InputStream is = new FileInputStream(new File("/path/to/workbook.xlsx"));
> Workbook workbook = StreamingReader.builder()
>         .rowCacheSize(100)    // number of rows to keep in memory (defaults to 10)
>         .bufferSize(4096)     // buffer size to use when reading InputStream to file (defaults to 1024)
>         .open(is);            // InputStream or File for XLSX file (required)
>
>
>
> With the code above you can loop through your rows and write them to CSV.
> Best regards
> Andreas
>
>
> On Mon, 2021-05-03 at 05:31 -0500, Oscar Bastidas wrote:
> > Hello,
> >
> > I am trying to read a large Excel spreadsheet (60,000 rows) but I get what
> > appears to be a memory leak error from the JVM when I use the *XSSFWorkbook*
> > API.  I learned recently that there are size limitations on Excel files
> > being read in this way and the Apache POI website specifically recommends
> > reading the file in a streaming fashion instead of taking the whole file in
> > memory.  To do this, POI recommends using something called *XLSX2CSV* but
> > the provided link to teach how to use this returns a "page not found"
> > error.
> >
> > Would someone please point me in the direction of how to handle reading my
> > big Excel file?
> >
> > The Apache POI URL that contains the link to *XLSX2CSV* is:
> >
> > http://poi.apache.org/components/spreadsheet/limitations.html
> >
> > Thanks for any help anyone can provide.
> >
> > Oscar
> >
> > Oscar Bastidas
> > Research Associate
> > University of Minnesota
>
>
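
A possible sketch of the "write them to CSV" half of Andreas's suggestion, using only the Java standard library. The class CsvRows and its methods are hypothetical names made up for illustration; they are not part of Apache POI or excel-streaming-reader. It assumes each spreadsheet row has already been turned into a List<String>, for example by iterating the cells of each Row from the streaming Workbook and calling getStringCellValue() on each one. Note that iterating cells this way may skip empty cells, so column alignment would need separate handling.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of POI or excel-streaming-reader.
final class CsvRows {

    // Quote a field only when it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled; null becomes an empty field.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped cells of one row with commas (no trailing line break).
    static String toCsvLine(List<String> cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escape(cells.get(i)));
        }
        return sb.toString();
    }

    // Write rows one at a time, so only the current row is held by this method.
    static void writeCsv(Iterable<List<String>> rows, Writer out) throws IOException {
        for (List<String> row : rows) {
            out.write(toCsvLine(row));
            out.write("\n");
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; in practice each list would come from one streamed Row.
        StringWriter out = new StringWriter();
        writeCsv(Arrays.asList(
                Arrays.asList("id", "note"),
                Arrays.asList("1", "said \"hi\", then left")), out);
        System.out.print(out);
    }
}
```

Because writeCsv accepts any Iterable and writes each row as it arrives, it could be pointed at a FileWriter and fed row by row from the streaming reader without collecting the whole sheet in memory first.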
