On Jul 17, 10:11 pm, Thomas Jollans <tho...@jollans.com> wrote: [Snip] > So, the contents of the file is identical, but Python 3 reads the whole > file, Python 2 reads only the data it uses. > > This looks like a simple optimisation: read the whole file at once, > instead of byte-by-byte, to improve performance when reading large > objects. (such as Python modules...) >
Good analysis and a nice catch. Thanks. It is likely that the intent is to optimize performance. > The question is: was storing multiple objects in sequence an intended > use of the marshal module? The documentation (http://docs.python.org/py3k/library/marshal.html) for marshal itself states (emphasis added by me), marshal.load(file)¶ Read *one value* from the open file and return it. If no valid value is read (e.g. because the data has a different Python version’s incompatible marshal format), raise EOFError, ValueError or TypeError. The file must be an open file object opened in binary mode ('rb' or 'r +b'). This suggests that support for reading multiple values is intended. > I doubt it. You can always wrap your data in > tuples or use pickle. > The code that I am moving to 3.x dates back to the python 1.5 days, when marshal was significantly faster than pickle and Zope was evolutionarily at the Bobo stage :-). I have switched the current code to pickle - makes more sense. The pickle files are a bit larger and loading it is a tad bit slower, but nothing that makes even a noticeable difference for my use case. Thanks. raj -- http://mail.python.org/mailman/listinfo/python-list