Re: Why does StringIO discard its initial value?

David Fraser Fri, 15 Apr 2005 08:56:22 -0700

Raymond Hettinger wrote:

[David Fraser]

Others may find this helpful; it's a pure Python wrapper for cStringIO
that makes it behave like StringIO, in that initialized objects are not
read-only. Would it be an idea to extend cStringIO like this in the
standard library? It shouldn't lose performance if used like a standard
cStringIO, but it prevents frustration :-)

IMO, that would be a step backwards.  Initializing the object and then
writing to it is not a good practice.  The cStringIO API needs to be as
file-like as possible.  With files, we create an empty object and then
start writing (the append mode for existing files is a different story).
Good code ought to maintain that parallelism so that it is easier to
substitute a real file for a writeable cStringIO object.

This whole thread (except for the documentation issue which has been
fixed) is about fighting the API rather than letting it be a guide to good
code.

If there were something wrong with the API, Guido would have long
since fired up the time machine and changed the timeline so that all
would be as right as rain ;-)

But surely the whole point of files is that you can do more than either create a new file or append to an existing one (seek and write, for instance)?
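
To make the seek-and-write case concrete, here is a minimal sketch in modern Python 3. It assumes io.BytesIO as a stand-in for the Python 2 StringIO/cStringIO classes this thread is about (those modules no longer exist in Python 3), so it illustrates the file-like behaviour being asked for rather than what cStringIO did at the time.

```python
import io

# Python 3 sketch: io.BytesIO stands in for the Python 2 StringIO/cStringIO
# classes discussed in this thread (an assumption made for illustration).
buf = io.BytesIO(b"hello world")   # buffer initialised with existing content

buf.seek(6)                        # move into the middle, as with a file opened "r+b"
buf.write(b"there")                # overwrite five bytes in place

result = buf.getvalue()
print(result)
```

The same three steps (open with content, seek, write) work on a real file opened in "r+b" mode, which is the parallel with files that the question is getting at.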

The reason I wrote this was to enable manipulating zip files inside zip files, in memory. This is on translate.sourceforge.net - I wanted to manipulate Mozilla XPI files, and replace file contents etc. within the XPI. XPI files are zip archives that contain jars (which are also zip archives). I needed to alter the contents of files within the inner zip files.

The zip classes in Python can handle adding files but not replacing them. cStringIO behaves as described above.

So I created extensions to the zipfile.ZipFile class that allow it to delete existing files, and add them again with new contents (thus replacing them).
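
The ZipFile extensions themselves are not shown in this message. As a rough illustration of the "replace" operation, here is a sketch in modern Python 3 that uses a different and simpler technique: it rebuilds the archive in memory, copying every member except the one being replaced. The helper name replace_member and the member names are hypothetical, and io.BytesIO is assumed in place of the Python 2 string buffers.

```python
import io
import zipfile

def replace_member(zip_bytes, name, new_data):
    """Return new archive bytes with member `name` replaced by `new_data`.

    Hypothetical helper: it copies all other members into a fresh
    in-memory archive instead of deleting the member in place.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename != name:
                # Passing the ZipInfo keeps the member's name, date and
                # compression type.
                dst.writestr(info, src.read(info.filename))
        dst.writestr(name, new_data)
    return out.getvalue()

# Build a small archive in memory, then replace one member.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as zf:
    zf.writestr("a.txt", "old contents")
    zf.writestr("b.txt", "untouched")

updated = replace_member(original.getvalue(), "a.txt", "new contents")
with zipfile.ZipFile(io.BytesIO(updated)) as zf:
    names = sorted(zf.namelist())
    a_text = zf.read("a.txt")
```

Rebuilding costs a full copy of the archive, which is the price of not having a delete operation on ZipFile.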

And I created wStringIO so that I could do this all in place on the existing zip files.

This all required some extra hacking because of the dual-layer zip files.

But all of this, as far as I can see, would have been really tricky using the existing zipfile and cStringIO classes, which both assume (conceptually) that files are either readable, or new, or merely appendable (for zipfile).

The problem for me was not that the cStringIO classes are too similar to files; it was that they are too dissimilar. All of this would work with either StringIO (but too slowly) or real files (but I needed it in memory because the zip files are inside other zip files).
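
For readers unfamiliar with the zip-inside-zip situation, here is a small sketch in modern Python 3 of why an in-memory, file-like buffer is needed. The archive layout and member names ("chrome/example.jar", "locale/strings.dtd") are made up for illustration, and io.BytesIO is again assumed in place of the Python 2 classes.

```python
import io
import zipfile

# Build a hypothetical two-layer archive entirely in memory:
# an outer zip (standing in for an XPI) holding an inner zip (a jar).
inner_buf = io.BytesIO()
with zipfile.ZipFile(inner_buf, "w") as inner:
    inner.writestr("locale/strings.dtd", "original text")

outer_buf = io.BytesIO()
with zipfile.ZipFile(outer_buf, "w") as outer:
    outer.writestr("chrome/example.jar", inner_buf.getvalue())

# Read the inner archive without touching the disk. ZipFile needs a
# seekable file-like object, so the member's bytes are wrapped in a buffer.
with zipfile.ZipFile(io.BytesIO(outer_buf.getvalue())) as outer:
    jar_bytes = outer.read("chrome/example.jar")
with zipfile.ZipFile(io.BytesIO(jar_bytes)) as inner:
    inner_names = inner.namelist()
    inner_text = inner.read("locale/strings.dtd")
```

The inner jar only ever exists as bytes inside the outer archive, so there is no real file to hand to ZipFile; a buffer that behaves like a file is the only option short of writing temporary files.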

Am I missing something?

David