Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-23 Thread Alan Gauld
On 24/02/12 05:11, Elaina Ann Hyde wrote: Ok, if I use awk I separate the file into an edible 240MB chunk, Why awk? Python is nearly always faster than awk... Even nawk or gawk. awk is a great language but I rarely use it nowadays other than for one-liners because perl/python/ruby are all gene
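
A minimal sketch of the pure-Python alternative being suggested here: split a huge ASCII catalogue into roughly 240 MB pieces on line boundaries instead of shelling out to awk. The filenames and chunk size below are only illustrative, not from the original post.

# Split a large text file into ~240 MB parts, breaking only at line ends.
CHUNK_BYTES = 240 * 1024 * 1024

def split_file(path, chunk_bytes=CHUNK_BYTES):
    part = 0
    out = None
    written = 0
    with open(path) as src:
        for line in src:
            if out is None or written >= chunk_bytes:
                if out is not None:
                    out.close()
                out = open("%s.part%03d" % (path, part), "w")
                part += 1
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()

split_file("huge_catalogue.txt")   # hypothetical filename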

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-23 Thread Asokan Pichai
Did you try loadtxt() from numpy? http://mail.scipy.org/pipermail/scipy-user/2010-August/026431.html The poster above notes that 2.5 million lines and 10 columns take about 3 minutes to load. Asokan Pichai
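
For reference, a minimal sketch of the numpy.loadtxt() suggestion; the filename and the assumption that the catalogue is whitespace-delimited are mine, not from the thread.

import numpy as np

# loadtxt reads a whitespace-delimited text table into a 2-D float array
data = np.loadtxt("huge_catalogue.txt")   # hypothetical filename
print(data.shape)                         # e.g. (2500000, 10) in the linked post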

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-23 Thread Elaina Ann Hyde
On Thu, Feb 23, 2012 at 9:07 PM, Alan Gauld wrote: On 23/02/12 01:55, Elaina Ann Hyde wrote: ns/7.2/lib/python2.7/site-packages/asciitable-0.8.0-py2.7.egg/asciitable/core.py", line 158, in get_lines lines = table.splitlines() MemoryError -- So thi

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-23 Thread Alan Gauld
On 23/02/12 01:55, Elaina Ann Hyde wrote: ns/7.2/lib/python2.7/site-packages/asciitable-0.8.0-py2.7.egg/asciitable/core.py", line 158, in get_lines lines = table.splitlines() MemoryError -- So this means I don't have enough memory to run through the large file? Probab
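A minimal sketch of the line-at-a-time approach implied by this reply: iterate over the file object instead of reading the whole table and calling splitlines(), so only one row is held in memory at a time. The filename and the assumption that RA/Dec sit in the first two columns are mine.

with open("huge_catalogue.txt") as big:   # hypothetical filename
    for line in big:
        fields = line.split()
        if not fields or line.startswith("#"):
            continue                       # skip blank and comment lines
        ra, dec = float(fields[0]), float(fields[1])
        # ... compare (ra, dec) against the small in-memory table here ...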

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-23 Thread Peter Otten
Elaina Ann Hyde wrote: Thanks for all the helpful hints, I really like the idea of using distances instead of a limit. Walter was right that the 'i != j' condition was causing problems. I think that Alan and Steven's use of the index separately was great as it makes this much easier to
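
A hedged sketch of the "use a distance instead of a box limit" idea: compute a small-angle separation between a source and every row of the second table, and keep the nearest match if it lies within a tolerance. The array names, the 0.01-degree tolerance, and the cos(dec) correction are my assumptions, not code from the thread.

import numpy as np

def nearest_match(ra, dec, ra2, dec2, tol=0.01):
    # approximate separation in degrees, correcting RA by cos(dec)
    dra = (ra2 - ra) * np.cos(np.radians(dec))
    ddec = dec2 - dec
    dist = np.sqrt(dra**2 + ddec**2)
    j = np.argmin(dist)                 # index of the closest row in table 2
    return j if dist[j] <= tol else None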

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-22 Thread Mark Lawrence
On 23/02/2012 01:55, Elaina Ann Hyde wrote: [big snips] Hi Elaina, I'm sorry but I can't help with your problem with the memory cos I don't know enough about the combination of Python and Unix wrt memory management. However, can I suggest that you use more whitespace in your code to make it

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-22 Thread Elaina Ann Hyde
On Wed, Feb 22, 2012 at 8:50 PM, Peter Otten <__pete...@web.de> wrote: Elaina Ann Hyde wrote: So, Python question of the day: I have 2 files that I could normally just read in with asciitable. The first file is a 12 column 8000 row table that I have read in via asciitable and

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-22 Thread Peter Otten
Elaina Ann Hyde wrote: So, Python question of the day: I have 2 files that I could normally just read in with asciitable. The first file is a 12 column 8000 row table that I have read in via asciitable and manipulated. The second file is enormous, has over 50,000 rows and about 20 column

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-22 Thread Steven D'Aprano
On Wed, Feb 22, 2012 at 04:44:57PM +1100, Elaina Ann Hyde wrote: So, Python question of the day: I have 2 files that I could normally just read in with asciitable. The first file is a 12 column 8000 row table that I have read in via asciitable and manipulated. The second file is enormous,

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-22 Thread Walter Prins
Hi Elaina, On 22 February 2012 05:44, Elaina Ann Hyde wrote: #select the value if it is very, very, very close if i != j and Radeg[i] <= (Radeg2[j]+0.01) and Radeg[i] Alan's pretty much said what I was thinking, but I have an additional question/concern: Why do you inclu
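
A sketch of the point being raised: i indexes one table and j the other, so the i != j guard adds nothing; a plain tolerance test on both coordinates is enough. The names follow the quoted snippet, but the declination test and the helper function are my assumptions about the elided part of the condition.

def candidate_matches(Radeg, DEdeg, Radeg2, DEdeg2, tol=0.01):
    matches = []
    for i in range(len(Radeg)):
        for j in range(len(Radeg2)):
            # close enough on the sky in both RA and Dec -- record the pair
            if abs(Radeg[i] - Radeg2[j]) <= tol and abs(DEdeg[i] - DEdeg2[j]) <= tol:
                matches.append((i, j))
    return matches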

Re: [Tutor] Reading/dealing/matching with truly huge (ascii) files

2012-02-22 Thread Alan Gauld
On 22/02/12 05:44, Elaina Ann Hyde wrote: file is enormous, has over 50,000 rows and about 20 columns. On modern computers it's not that enormous - probably around 10MB? But there are techniques for this which we can cover another time if you do hit files bigger than fit in memory. I didn't g
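
A rough sketch of the kind of technique hinted at here for tables that do not fit in memory: keep only the small 8000-row table in RAM and stream the big catalogue once, matching as you go. The filenames, column positions, and tolerance are hypothetical, not taken from the original code.

import numpy as np

small = np.loadtxt("small_table.txt")        # ~8000 rows fits easily in memory
ra_small, dec_small = small[:, 0], small[:, 1]

tol = 0.01
with open("huge_catalogue.txt") as big:      # streamed, never fully loaded
    for line in big:
        cols = line.split()
        if not cols or line.startswith("#"):
            continue
        ra, dec = float(cols[0]), float(cols[1])
        close = (np.abs(ra_small - ra) <= tol) & (np.abs(dec_small - dec) <= tol)
        if close.any():
            print(line.rstrip())             # or write the match to an output file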