Tom Goddard <godd...@cgl.ucsf.edu> added the comment:

I agree that having such large Python code files is a rare circumstance, and that optimizing the byte-code compiler for that case should be a low priority.
Thanks for the cPickle suggestion. The Chimera session file Python code is mostly large nested dictionaries and sequences. I first tested embedding the cPickle output in the Python code via repr(), which gave a rather larger file size because each 8-bit character became 4 bytes in the text string (e.g. "\xe8"). Pickling with cPickle and base64-encoding the result dropped the file size by about a factor of 2.5, and adding bzip2 or zlib compression before the base64 encoding dropped it by another factor of 2.

The big win is that byte-code compilation used 150 Mbytes of memory and 5 seconds instead of 2 Gbytes and 15 minutes of thrashing for a 40 Mbyte Python file.

I think our reason for not using pickled data in the session files originally was that we like users to be able to look at and edit the session files in a text editor. (This is research software, where such hacks are sometimes handy.) But the especially large data structures in the sessions can't reasonably be meddled with by users, so pickling them should be fine.

Pickling adds about 15% to the session save time and reduces the session opening time by about the same amount. Compression slows the save down by another 15% and probably is not worth the factor-of-2 reduction in file size in our case.
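A minimal sketch of the encoding scheme described above, assuming cPickle plus base64 with optional zlib compression (the function names and example data are illustrative, not Chimera's actual session-saving code):

# Instead of embedding a large nested data structure as a repr() literal,
# pickle it, optionally compress it, and base64-encode it so it can be
# stored as a plain ASCII string inside the session file.

import cPickle, base64, zlib

def encode_state(state, compress=False):
    # Pickle with the most compact protocol available.
    data = cPickle.dumps(state, cPickle.HIGHEST_PROTOCOL)
    if compress:
        data = zlib.compress(data)
    # Base64 keeps the result as ASCII text for the session file.
    return base64.b64encode(data)

def decode_state(text, compressed=False):
    data = base64.b64decode(text)
    if compressed:
        data = zlib.decompress(data)
    return cPickle.loads(data)

if __name__ == '__main__':
    # Hypothetical session state: large nested dicts and sequences.
    state = {'atoms': range(1000), 'colors': {'C': (0.5, 0.5, 0.5)}}
    text = encode_state(state, compress=True)
    # The session file would then contain something like:
    #   state = decode_state("...base64 text...", compressed=True)
    assert decode_state(text, compressed=True) == state

Because the pickled data is base64-encoded, the embedded blob stays plain ASCII and the rest of the session file remains readable and editable in a text editor; only the large data structures become opaque.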