Hi all

This question is mostly to satisfy my curiosity.

In my app I use XML to represent certain objects, such as form definitions and process definitions.

They are stored in a database. I use etree.tostring() when storing them and etree.fromstring() when reading them back. They can be quite large, so I use gzip to compress them before storing them as a blob.

The sequence of events when reading them back is -
   - select gzip'd data from database
   - run gzip.decompress() to get the uncompressed XML as a bytes object
   - run etree.fromstring() to convert to an etree object
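
For illustration, here is a minimal sketch of that round trip. It assumes etree is xml.etree.ElementTree from the standard library (lxml.etree would look much the same), and the names pack, unpack and the sample form element are made up for the example.

```python
import gzip
import xml.etree.ElementTree as etree  # assumption: stdlib ElementTree, not lxml


def pack(elem):
    """Serialise an element and gzip it, ready to be stored as a blob."""
    return gzip.compress(etree.tostring(elem))


def unpack(blob):
    """Current approach: decompress everything, then parse the result."""
    xml_bytes = gzip.decompress(blob)   # the whole document is held in memory here
    return etree.fromstring(xml_bytes)  # returns the root Element


# Hypothetical form definition, standing in for a row read from the database.
form = etree.Element("form", name="demo")
etree.SubElement(form, "field", name="id")

blob = pack(form)
root = unpack(blob)
```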

I was wondering if I could avoid having the whole decompressed document in memory, and create the etree object directly from the gzip'd data. I came up with this -

   - select gzip'd data from database
   - create a BytesIO object - fd = io.BytesIO(data)
   - use gzip to open the object - gf = gzip.open(fd)
   - run etree.parse(gf) to convert to an etree object
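
The alternative steps can be sketched the same way, again assuming the standard library's xml.etree.ElementTree; unpack_streaming and the sample data are hypothetical names for the example. One difference to be aware of is that etree.parse() returns an ElementTree rather than the root Element that etree.fromstring() returns.

```python
import gzip
import io
import xml.etree.ElementTree as etree  # assumption: stdlib ElementTree, not lxml


def unpack_streaming(blob):
    """Alternative approach: let the parser read from a gzip file object."""
    fd = io.BytesIO(blob)           # file-like wrapper around the compressed bytes
    with gzip.open(fd) as gf:       # binary read mode is the default
        tree = etree.parse(gf)      # the parser calls gf.read() itself
    return tree.getroot()           # parse() gives an ElementTree, so take its root


# Hypothetical compressed blob, standing in for a row read from the database.
blob = gzip.compress(b"<form name='demo'><field name='id'/></form>")
root = unpack_streaming(blob)
```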

It works.

But I don't know what goes on under the hood, so I don't know if this achieves anything. If any of the steps involves decompressing the data and storing the entire string in memory, I may as well stick to my present approach.
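
One way I could try to check this is to compare peak memory for the two approaches with tracemalloc. This is only a rough sketch: the function names and the generated test document are made up, it assumes xml.etree.ElementTree, and tracemalloc only counts allocations made through Python's own allocators, so memory used inside C libraries may not show up.

```python
import gzip
import io
import tracemalloc
import xml.etree.ElementTree as etree  # assumption: stdlib ElementTree, not lxml


def via_string(blob):
    """Decompress fully, then parse."""
    return etree.fromstring(gzip.decompress(blob))


def via_stream(blob):
    """Parse directly from a gzip file object."""
    with gzip.open(io.BytesIO(blob)) as gf:
        return etree.parse(gf).getroot()


def peak_bytes(func, blob):
    """Return the peak traced memory (in bytes) while func(blob) runs."""
    tracemalloc.start()
    func(blob)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


# Synthetic document, large enough for the difference (if any) to be visible.
xml = b"<root>" + b"<item>x</item>" * 100_000 + b"</root>"
blob = gzip.compress(xml)

peak_string = peak_bytes(via_string, blob)
peak_stream = peak_bytes(via_stream, blob)
print("fromstring:", peak_string, "parse from gzip:", peak_stream)
```

Both figures include the memory for the finished tree, so any saving would show up as the difference between the two numbers rather than in either one alone.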

Any thoughts?

Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list