Hi

I used the Apache POI to read an excel file. The file has 16000 rows and 50 
columns. 


Each cell is read as a string and they are put into a 2D array 


(e.g. a row is a string array with size 50 and 16000 rows are put in another 
array). 


the code looks like



try (FileInputStream fIP = new FileInputStream(selectedFile)) {
  XSSFWorkbook wb = new XSSFWorkbook(fIP);
  XSSFSheet ws = wb.getSheetAt(0);
  maxRows = ws.getLastRowNum() + 1;
  maxColumns = ws.getRow(0).getLastCellNum();
  theRows = new OneRow[maxRows]; 
  Iterator< Row > rowIt = ws.iterator();
  XSSFRow row;
  int i = 0;
  long chs = 0;
  while ( rowIt.hasNext() ) {
    row = (XSSFRow) rowIt.next();
    Iterator< Cell > cellIt = row.cellIterator();
    String [] str = new String[maxColumns];
    int j = 0;
    long ch = 0;
    while ( cellIt.hasNext() ) {
      Cell cell = cellIt.next();
      str[ j ] = cell.getStringCellValue();
             ch += str[ j ].length(); // sum one row's length
      System.out.print( str[ j ].length() + " " );
      ++j;
    }
    System.out.print( "\n  ->" + ch + "\n" );
    chs += ch;   // sum total characters length
    theRows[ i ] = new OneRow( maxColumns );
    theRows[ i ].add( str );
    ++i;
  }
  System.out.print( "\n  ==>" + chs + "\n" );
}



the program also counts the length of each cell, ch and the size of all 


rows are updated with chs += ch. Then the total number of characters are chs. 


The value is 3,230,798 (3M chars) but the used heap is about 1GB. That means a 


character is consuming 330 bytes!! (more or less).



Using the profilers, I see that needs a lot of heap and I have to increase 


the heap size to 2048. Also the cpu time is devoted to GC which is undesirable.



I have to say that XSSFSheet ws = wb.getSheetAt(0); is the most time and 


memory consuming part. In order to investigate that, I manually count the data 


structure sizes with ch and chs.




Please see the profiler at  https://i.stack.imgur.com/ZfLHj.jpg

In the memory chart and from left, the first peak is where it finishes 


loading the excel file.



 Regards,
Mahmood

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
For additional commands, e-mail: user-h...@poi.apache.org

Reply via email to