cshirky wrote:
Newbie question:
I'm trying to turn a large XML file (~7G compressed) into a YAML file,
and my program seems to be buffering the input.
IOtest.py is just
    import sys
    for line in sys.stdin.readlines():
        print line
but when I run
    $ gzcat bigXMLfile.gz | IOtest.py
it hangs, then dies.
The goal of the program is to build a YAML file with print statements,
rather than building a gigantic nested dictionary, but I am obviously
doing something wrong in passing input through without buffering. Any
advice gratefully fielded.
readlines() reads the whole file into memory before the loop even starts.
Try xreadlines(), the lazy version, instead. And I'm not 100% sure, but I
*think* doing
    for line in sys.stdin:
        ...
does exactly that.
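For what it's worth, here is a minimal sketch of the line-by-line version.
The function name copy_lines is made up for illustration, and it writes with
write() rather than print so the newline already on each line is not
doubled. Iterating over the file object should also work on Python 3, where
xreadlines() no longer exists.

```python
def copy_lines(src, dst):
    """Copy src to dst one line at a time and return the line count.

    copy_lines is a hypothetical helper name, not part of any library.
    """
    count = 0
    for line in src:        # iterating the file object fetches lines lazily
        dst.write(line)     # each line still carries its trailing newline
        count += 1
    return count
```

In IOtest.py you would import sys and call copy_lines(sys.stdin, sys.stdout).
Only one line is held at a time, so this helps only if the XML actually
contains line breaks; a file that is one enormous line would still be read
in one piece.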
Diez