Re: Out of memory while reading excel file

2017-05-12 Thread codewizard
On Thursday, May 11, 2017 at 5:01:57 AM UTC-4, Mahmood Naderan wrote:
> Excuse me, I changed
>
>     csv.writer(outstream)
>
> to
>
>     csv.writer(outstream, delimiter =' ')
>
> It puts space between cells and omits "" around some content. However,
> between two lines there is a new empty line.

Re: Out of memory while reading excel file

2017-05-12 Thread eryk sun
On Fri, May 12, 2017 at 8:03 PM, Peter Otten <__pete...@web.de> wrote:
> I don't have a Windows system to test, but doesn't that mean that on Windows
>
> with open("tmp.csv", "w") as f:
>     csv.writer(f).writerows([["one"], ["two"]])
> with open("tmp.csv", "rb") as f:
>     print(f.read())
>
> wo
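A minimal sketch of the usual remedy for the empty lines discussed in this thread, assuming Python 3. csv.writer ends each row with "\r\n" by default, so a file opened in plain text mode on Windows translates the "\n" a second time and the file ends up with "\r\r\n". Opening the file with newline="" switches that translation off. The file name comes from the quoted snippet; the temporary directory and the helper names write_rows and read_raw are assumptions made here for illustration.

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    # newline="" stops the text layer from translating the "\n" inside
    # the "\r\n" line terminator that csv.writer emits by default.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def read_raw(path):
    # Read the bytes back unchanged so the line endings are visible.
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "tmp.csv")
    write_rows(path, [["one"], ["two"]])
    raw = read_raw(path)

print(raw)
```

With newline="" the bytes on disk should be the same on every platform, which makes the check in the quoted snippet easy to repeat.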

Re: Out of memory while reading excel file

2017-05-12 Thread Peter Otten
Pavol Lisy wrote:
> On 5/11/17, Peter Otten <__pete...@web.de> wrote:
>> Mahmood Naderan via Python-list wrote:
>>> between two lines there is a new empty line. In other words, the first
>>> line is the first row of the Excel file. The second line is empty ("\n") and
>>> the third line is the second r

Re: Out of memory while reading excel file

2017-05-12 Thread Pavol Lisy
On 5/11/17, Peter Otten <__pete...@web.de> wrote:
> Mahmood Naderan via Python-list wrote:
>
>> Excuse me, I changed
>>
>>     csv.writer(outstream)
>>
>> to
>>
>>     csv.writer(outstream, delimiter =' ')
>>
>> It puts space between cells and omits "" around some content.
>
> If your data doesn't conta

Re: Out of memory while reading excel file

2017-05-11 Thread Mahmood Naderan via Python-list
Thanks a lot for suggestions. It is now solved. Regards, Mahmood -- https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-11 Thread Peter Otten
Mahmood Naderan via Python-list wrote:
> Excuse me, I changed
>
>     csv.writer(outstream)
>
> to
>
>     csv.writer(outstream, delimiter =' ')
>
> It puts space between cells and omits "" around some content.
If your data doesn't contain any spaces that's fine. Otherwise you need a way to dist

Re: Out of memory while reading excel file

2017-05-11 Thread Mahmood Naderan via Python-list
Excuse me, I changed
    csv.writer(outstream)
to
    csv.writer(outstream, delimiter =' ')
It puts space between cells and omits "" around some content. However, between two lines there is a new empty line. In other words, the first line is the first row of the Excel file. The second line is empty ("\

Re: Out of memory while reading excel file

2017-05-11 Thread Mahmood Naderan via Python-list
Thanks. That code is so simple and works. However, there are things to be considered. With the CSV format, cells in a row are separated by ',' and for some cells it writes "" around the cell content. So, if the Excel file looks like
    CHR1 11,232,445
the output file looks like
    CHR1,"11,232,4
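A short sketch of why the quotes appear, using only the standard library. With the default QUOTE_MINIMAL setting, csv.writer quotes a cell only when it contains the delimiter, the quote character, or a line-break character, so changing the delimiter changes which cells get quoted. The helper name render and the "CHR 1" sample cell are made up for this illustration; the other values come from the message above.

```python
import csv
import io


def render(row, **writer_options):
    # Write a single row into an in-memory buffer and return the raw text.
    buffer = io.StringIO()
    csv.writer(buffer, **writer_options).writerow(row)
    return buffer.getvalue()


row = ["CHR1", "11,232,445"]

# Default delimiter ",": the second cell contains commas, so it is quoted.
comma_line = render(row)

# Space or tab as delimiter: neither cell contains it, so no quotes.
space_line = render(row, delimiter=" ")
tab_line = render(row, delimiter="\t")

# A cell that itself contains a space is quoted again with delimiter=" ".
spaced_cell_line = render(["CHR 1", "11,232,445"], delimiter=" ")

for line in (comma_line, space_line, tab_line, spaced_cell_line):
    print(repr(line))
```

A tab delimiter is often the safer choice when cells may contain spaces, since a reader can still split the columns unambiguously.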

Re: Out of memory while reading excel file

2017-05-11 Thread Peter Otten
Mahmood Naderan via Python-list wrote:
> I wrote this:
>
> a = np.zeros((p.max_row, p.max_column), dtype=object)
> for y, row in enumerate(p.rows):
>     for cell in row:
>         print (cell.value)
>         a[y] = cell.value
In the line above you overwrite the row in the numpy array
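One way to avoid the overwrite, sketched with stand-in data so it runs without a workbook. Assigning a single value to a[y] broadcasts it across the whole row, which is why every column ends up holding the last cell. Indexing both the row and the column fills one element at a time. The Cell namedtuple and the sample values are hypothetical placeholders for the objects that p.rows yields in openpyxl.

```python
from collections import namedtuple

import numpy as np

# Hypothetical stand-in for an openpyxl cell; only .value is used.
Cell = namedtuple("Cell", "value")

rows = [
    [Cell("CHR1"), Cell("NM_198576.3")],
    [Cell("CHR2"), Cell("NM_000001.1")],
]

# The bug: a scalar assigned to a whole row is broadcast to every column.
broken = np.zeros((len(rows), len(rows[0])), dtype=object)
for y, row in enumerate(rows):
    for cell in row:
        broken[y] = cell.value

# The fix: track the column index as well and assign one element.
fixed = np.zeros((len(rows), len(rows[0])), dtype=object)
for y, row in enumerate(rows):
    for x, cell in enumerate(row):
        fixed[y, x] = cell.value

print(broken)
print(fixed)
```

Assigning the whole row at once, as in a[y] = [cell.value for cell in row], gives the same result as the fixed loop.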

Re: Out of memory while reading excel file

2017-05-11 Thread Mahmood Naderan via Python-list
I wrote this:
    a = np.zeros((p.max_row, p.max_column), dtype=object)
    for y, row in enumerate(p.rows):
        for cell in row:
            print (cell.value)
            a[y] = cell.value
            print (a[y])
For one of the cells, I see
    NM_198576.3
    ['NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_1985

Re: Out of memory while reading excel file

2017-05-10 Thread Peter Otten
Mahmood Naderan via Python-list wrote:
>>a = numpy.zeros((ws.max_row, ws.max_column), dtype=float)
>>for y, row in enumerate(ws.rows):
>>    a[y] = [cell.value for cell in row]
>
> Peter,
>
> As I used this code, it gave me an error that cannot convert string to
> float for the first cell.
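A small sketch of the error described above and one possible way around it, assuming the cells really are all strings. A float array tries to parse every assigned value as a number and raises ValueError on text, while an object array simply stores references to the Python strings. The sample row is invented for illustration.

```python
import numpy as np

row = ["CHR1", "NM_198576.3"]

# dtype=float: numpy tries to convert each string and fails.
as_float = np.zeros((1, len(row)), dtype=float)
try:
    as_float[0] = row
    error = None
except ValueError as exc:
    error = str(exc)

# dtype=object: the array holds references to the original strings.
as_object = np.zeros((1, len(row)), dtype=object)
as_object[0] = row

print(error)
print(as_object)
```

An object array gives none of numpy's numeric speed or compactness, which is consistent with the question elsewhere in the thread of whether numpy is needed for string data at all.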

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list
Hi, I used the old-fashioned coding style to create a matrix and read/add the cells.
    W = load_workbook(fname, read_only = True)
    p = W.worksheets[0]
    m = p.max_row
    n = p.max_column
    arr = np.empty((m, n), dtype=object)
    for r in range(1, m):
        for c in range(1, n):
            d = p.cell(row=r, colu

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list
>a = numpy.zeros((ws.max_row, ws.max_column), dtype=float)
>for y, row in enumerate(ws.rows):
>    a[y] = [cell.value for cell in row]
Peter, As I used this code, it gave me an error that cannot convert string to float for the first cell. All cells are strings. Regards, Mahmood

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list
Hi, I am confused with that. If you say that numpy is not suitable for my case and may have large overhead, what is the alternative then? Do you mean that numpy is a good choice here while we can reduce its overhead? Regards, Mahmood

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list
On Wed, 5/10/17, Peter Otten <__pete...@web.de> wrote:
Subject: Re: Out of memory while reading excel file
To: python-list@python.org
Date: Wednesday, May 10, 2017, 6:30 PM
Mahmood Naderan via Python-list wrote:
> Well actually cells are treated as strings and not integer

Re: Out of memory while reading excel file

2017-05-10 Thread Peter Otten
Mahmood Naderan via Python-list wrote:
> Well actually cells are treated as strings and not integer or float
> numbers.
May I ask why you are using numpy when you are dealing with strings? If you provide a few details about what you are trying to achieve someone may be able to suggest a workabl

Re: Out of memory while reading excel file

2017-05-10 Thread Irmen de Jong
On 10-5-2017 17:12, Mahmood Naderan wrote:
> So, I think numpy is unable to manage the memory.
That assumption is very likely to be incorrect.
>> np.array([[i.value for i in j] for j in p.rows])
I think the problem is in the way you feed your excel data into the numpy array constructor. The co

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list
rows. Mine is about 100k. Currently, the task manager shows about 4GB of RAM usage while working with numpy.
Regards, Mahmood
On Wed, 5/10/17, Peter Otten <__pete...@web.de> wrote:
Subject: Re: Out of memory while reading excel file
To: pytho

Re: Out of memory while reading excel file

2017-05-10 Thread Peter Otten
Mahmood Naderan via Python-list wrote:
> Thanks for your reply. The openpyxl part (reading the workbook) works
> fine. I printed some debug information and found that when it reaches the
> np.array, after some 10 seconds, the memory usage goes high.
>
> So, I think numpy is unable to manage th

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list
Thanks for your reply. The openpyxl part (reading the workbook) works fine. I printed some debug information and found that when it reaches the np.array, after some 10 seconds, the memory usage goes high. So, I think numpy is unable to manage the memory. Regards, Mahmood On Wednesday, Ma

Re: Out of memory while reading excel file

2017-05-10 Thread Peter Otten
Mahmood Naderan via Python-list wrote:
> Hello,
>
> The following code which uses openpyxl and numpy, fails to read large
> Excel (xlsx) files. The file is 20Mb which contains 100K rows and 50
> columns.
>
> W = load_workbook(fname, read_only = True)
>
> p = W.worksheets[0]
>
> a=[]
>
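A hedged sketch of the row-by-row approach that the later replies in this thread converge on: write each row to a CSV stream as it is read, instead of collecting all 100K rows in a list first. This is not necessarily the code that was posted. The function name rows_to_csv is made up here, and the Cell namedtuple stands in for openpyxl cells so the example runs without a workbook. With openpyxl, the rows argument would be p.rows from a workbook opened with load_workbook(fname, read_only = True), as in the quoted code.

```python
import csv
import io
from collections import namedtuple


def rows_to_csv(rows, outstream, **writer_options):
    """Write rows of cell-like objects to outstream, one row at a time.

    Only the current row is held in memory, so the peak usage does not
    grow with the number of rows. Returns the number of rows written.
    """
    writer = csv.writer(outstream, **writer_options)
    count = 0
    for row in rows:
        writer.writerow([cell.value for cell in row])
        count += 1
    return count


# Hypothetical stand-in for openpyxl cells; only .value is used.
Cell = namedtuple("Cell", "value")

# A generator, so the rows are produced lazily like a read-only worksheet.
sample_rows = ([Cell("CHR%d" % i), Cell("NM_%06d.1" % i)] for i in range(1, 4))

out = io.StringIO()
written = rows_to_csv(sample_rows, out, delimiter="\t")
print(written)
print(out.getvalue())
```

When writing to a real file rather than a StringIO, the file should be opened with newline="" so that the row terminator is not translated a second time on Windows.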