I wrote this: a = np.zeros((p.max_row, p.max_column), dtype=object) for y, row in enumerate(p.rows): for cell in row: print (cell.value) a[y] = cell.value print (a[y])
For one of the cells, I see NM_198576.3 ['NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'] These are 50 NM_198576.3 in a[y] and 50 is the number of columns in my excel file (p.max_column) The excel file looks like CHR1 11,202,100 NM_198576.3 PASS 3.08932 G|B|C - . . . Note that in each row, some cells are '-' or '.' only. I want to read all cells as string. Then I will write the matrix in a file and my main code (java) will process that. I chose openpyxl for reading excel files, because Apache POI (a java package for manipulating excel files) consumes huge memory even for medium files. So my python script only transforms an xlsx file to a txt file keeping the cell positions and formats. Any suggestion? Regards, Mahmood -- https://mail.python.org/mailman/listinfo/python-list