New submission from christen:

On September 11, 2007, I downloaded Python 3k.

The good news:
Under Windows, Python 3k properly reads files larger than 4 GB (in
contrast to Python 2.5, which skips some lines; see below).

The bad news: Python 3k is very slow compared to Python 2.5; see the results below.
The code below reads a 4.9 GB file of 81,017,719 lines (a GenBank entry of bacterial
sequences):

#######################
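import time

# Timestamp before opening the file
print(time.localtime())
fichin = open(r'D:\pythons\16s\total_gb_161_16S.gb')
# Timestamp after opening the file
t0 = time.localtime()
print(t0)
i = 0

# Count lines, printing a timestamp every million lines
for li in fichin:
    i += 1
    if i % 1000000 == 0:
        print(i, time.localtime())

fichin.close()
print()
# Total number of lines read, then the final timestamp
print(i)
print(time.localtime())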
#########################
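
For anyone reproducing this measurement on a current Python, here is a minimal sketch of the same line-counting loop that reports elapsed seconds instead of wall-clock tuples. The name `count_lines` and its parameters are hypothetical and not part of the original report, and `time.perf_counter` assumes Python 3.3 or later. Passing `mode="rb"` reads bytes without text decoding, which may help separate raw I/O cost from decoding cost; the report itself does not establish what causes the slowdown.

```python
import time


def count_lines(path, mode="r", report_every=1000000):
    """Count the lines in `path` and return (line_count, elapsed_seconds).

    Prints the running count and elapsed seconds every `report_every` lines.
    """
    start = time.perf_counter()
    count = 0
    with open(path, mode) as handle:
        for count, _line in enumerate(handle, start=1):
            if count % report_every == 0:
                print(count, round(time.perf_counter() - start, 2))
    return count, time.perf_counter() - start
```

Calling `count_lines(path)` and then `count_lines(path, mode="rb")` on the same file gives two timings that can be compared directly.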


I got the following results on the same machine (Windows XP 64), running
the script under both Python 3k and Python 2.5.
As soon as my BSD and Linux machines are done with their calculations, I will
try it on them as well.
Best
Richard Christen


Python 3k

(2007, 9, 10, 13, 53, 36, 0, 253, 1)
(2007, 9, 10, 13, 53, 36, 0, 253, 1)
1000000 (2007, 9, 10, 13, 53, 49, 0, 253, 1)
2000000 (2007, 9, 10, 13, 54, 3, 0, 253, 1)
3000000 (2007, 9, 10, 13, 54, 18, 0, 253, 1)
4000000 (2007, 9, 10, 13, 54, 32, 0, 253, 1)
5000000 (2007, 9, 10, 13, 54, 47, 0, 253, 1)
....
77000000 (2007, 9, 10, 14, 14, 55, 0, 253, 1)
78000000 (2007, 9, 10, 14, 15, 9, 0, 253, 1)
79000000 (2007, 9, 10, 14, 15, 22, 0, 253, 1)
80000000 (2007, 9, 10, 14, 15, 36, 0, 253, 1)
81000000 (2007, 9, 10, 14, 15, 49, 0, 253, 1)

81017719    #this is the proper number of lines 
(2007, 9, 10, 14, 15, 50, 0, 253, 1)


Python 2.5

(2007, 9, 10, 14, 18, 33, 0, 253, 1)
(2007, 9, 10, 14, 18, 33, 0, 253, 1)
(1000000, (2007, 9, 10, 14, 18, 34, 0, 253, 1))
(2000000, (2007, 9, 10, 14, 18, 34, 0, 253, 1))
(3000000, (2007, 9, 10, 14, 18, 35, 0, 253, 1))
(4000000, (2007, 9, 10, 14, 18, 35, 0, 253, 1))
(5000000, (2007, 9, 10, 14, 18, 36, 0, 253, 1))
...
(77000000, (2007, 9, 10, 14, 19, 10, 0, 253, 1))
(78000000, (2007, 9, 10, 14, 19, 11, 0, 253, 1))
(79000000, (2007, 9, 10, 14, 19, 11, 0, 253, 1))
(80000000, (2007, 9, 10, 14, 19, 12, 0, 253, 1))
(81000000, (2007, 9, 10, 14, 19, 12, 0, 253, 1))
()
81014962      #python 2.5 missed 2,757 lines !!!!
(2007, 9, 10, 14, 19, 12, 0, 253, 1)
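
For a rough comparison, the first and last timestamps of each run above can be converted into elapsed seconds. This is only a sketch: `elapsed_seconds` is a hypothetical helper, and the tuples are copied from the output above at one-second resolution, so the resulting ratio is approximate.

```python
from datetime import datetime


def elapsed_seconds(start, end):
    """Seconds between two time.localtime()-style tuples (first six fields only)."""
    return (datetime(*end[:6]) - datetime(*start[:6])).total_seconds()


# First and last timestamps copied from the two runs above
py3k = elapsed_seconds((2007, 9, 10, 13, 53, 36, 0, 253, 1),
                       (2007, 9, 10, 14, 15, 50, 0, 253, 1))
py25 = elapsed_seconds((2007, 9, 10, 14, 18, 33, 0, 253, 1),
                       (2007, 9, 10, 14, 19, 12, 0, 253, 1))

# Line counts reported by each interpreter
missing = 81017719 - 81014962

print(py3k, py25, round(py3k / py25, 1), missing)
# prints: 1334.0 39.0 34.2 2757
```

With these values, the Python 3k run took about 1334 seconds and the Python 2.5 run about 39 seconds, roughly a 34-fold difference, while Python 2.5 counted 2757 fewer lines.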

----------
components: Tests
messages: 55777
nosy: [EMAIL PROTECTED]
severity: normal
status: open
title: reading large files
type: behavior
versions: Python 3.0

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1141>
__________________________________
_______________________________________________
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
