I wrote this:

a = np.zeros((p.max_row, p.max_column), dtype=object)
for y, row in enumerate(p.rows):
    for cell in row:
        print(cell.value)
        a[y] = cell.value
        print(a[y])


For one of the cells, I see

NM_198576.3
['NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3']

 
There are 50 copies of NM_198576.3 in a[y], and 50 is the number of columns in
my Excel file (p.max_column).
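
The repeated value is consistent with NumPy broadcasting: assigning a scalar
to a[y] sets every element of row y, so each cell overwrites all 50 columns.
A minimal sketch of the difference, using a small hand-made list in place of
the worksheet (the `rows` data and the 2x4 shape are made up for illustration,
not taken from the real file):

```python
import numpy as np

# Hypothetical stand-in for the worksheet rows (not the real file).
rows = [
    ["CHR1", "11,202,100", "NM_198576.3", "PASS"],
    ["CHR1", "11,202,200", "-", "."],
]

broadcast = np.zeros((2, 4), dtype=object)
indexed = np.zeros((2, 4), dtype=object)

for y, row in enumerate(rows):
    for x, value in enumerate(row):
        broadcast[y] = value   # scalar is broadcast across the whole row y
        indexed[y, x] = value  # only the element at (row y, column x) is set

print(broadcast[0])  # every column holds the last value assigned in row 0
print(indexed[0])    # each column holds its own value
```

Indexing with both the row and the column, as in `indexed[y, x]`, is probably
what the loop above needs.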



A row in the Excel file looks like this:

CHR1     11,202,100     NM_198576.3     PASS     3.08932    G|B|C     -    .    .    .



Note that in each row, some cells contain only '-' or '.'. I want to read all
cells as strings. I will then write the matrix to a file, and my main code
(Java) will process it. I chose openpyxl for reading Excel files because Apache
POI (a Java package for manipulating Excel files) consumes a huge amount of
memory, even for medium-sized files.

So my Python script only converts an xlsx file to a txt file, keeping the cell
positions and formats.
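
A rough sketch of that conversion step, under some assumptions: the output is
tab-separated, empty cells are written as empty fields so the columns stay
aligned, and the function name `write_rows_as_text` is made up for
illustration. With openpyxl, the rows could come from
`ws.iter_rows(values_only=True)` on a workbook opened with
`load_workbook(path, read_only=True)`; the sample tuple below only stands in
for that.

```python
import io


def write_rows_as_text(rows, out, sep="\t"):
    """Write each row as one line, one field per cell, preserving positions."""
    for row in rows:
        # None marks an empty cell; keep its position as an empty field.
        fields = ["" if value is None else str(value) for value in row]
        out.write(sep.join(fields) + "\n")


# Hypothetical sample row, shaped like the values a worksheet might yield.
sample = [
    ("CHR1", 11202100, "NM_198576.3", "PASS", 3.08932, "G|B|C", "-", ".", None, "."),
]

buffer = io.StringIO()  # a real script would pass an open text file instead
write_rows_as_text(sample, buffer)
text = buffer.getvalue()
print(text, end="")
```

Note that str() is applied to the stored cell value, so a number displayed as
11,202,100 in Excel may come out as 11202100 unless the cell actually holds
text.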

Any suggestions?

Regards,
Mahmood
