On Jan 26, 10:22 am, redbaron <ivanov.ma...@gmail.com> wrote:
> I've one big (6.9 Gb) .gz file with text inside it.
> zcat bigfile.gz > /dev/null does the job in 4 minutes 50 seconds
>
> python code have been doing the same job for 25 minutes and still
> doesn't finish =( the code is simpliest I could ever imagine:
>
> def main():
>     fh = gzip.open(sys.argv[1])
>     all(fh)
>
> As far as I understand most of the time it executes C code, so pythons
> no overhead should be noticible. Why is it so slow?
Look at what each operation actually does. zcat simply decompresses your data and writes it straight to /dev/null; nothing else is done with the output. When you call 'all(fh)', on the other hand, you're iterating over every line in bigfile.gz, and all() then tests each line for truth. In other words, you're reading the file and scanning it for newlines on top of decompressing it, rather than simply running the decompression.
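To compare like with like, one option is to read the decompressed stream in large fixed-size blocks and throw the data away, which is closer to what zcat does. The sketch below is only an illustration: the names count_bytes_chunked, count_lines and CHUNK_SIZE are made up for this example, and the 1 MiB block size is an arbitrary assumption, not a tuned value. Timing the two functions on the same file should show how much of the cost comes from line splitting rather than decompression.

```python
import gzip
import sys

# Hypothetical block size for illustration; 1 MiB is an arbitrary choice.
CHUNK_SIZE = 1 << 20


def count_bytes_chunked(path, chunk_size=CHUNK_SIZE):
    """Decompress the file in fixed-size blocks and discard the data.

    No newline scanning happens here; only the total size is tracked.
    """
    total = 0
    with gzip.open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total


def count_lines(path):
    """Decompress the file line by line, the way all(fh) iterates it."""
    lines = 0
    with gzip.open(path, "rb") as fh:
        for _ in fh:
            lines += 1
    return lines


if __name__ == "__main__" and len(sys.argv) > 1:
    print(count_bytes_chunked(sys.argv[1]))
```

Run it as a script with the .gz path as the only argument, or import the two functions and time each one separately.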