Steven D'Aprano wrote:
I'm no longer *claiming* anything, I'm *asking* whether random access to a 4GB XML file is something that is credible or useful. It is my understanding that XML is particularly ill-suited to random access once the amount of data is too large to fit in RAM.
An XML file doesn't contain any indexing information, so random access to a large XML file is very inefficient. You can build (or precompute) index information and store it in a separate file, of course, but that's hardly something that's useful in the general case.
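To make the index idea concrete, here is a minimal sketch using the standard library's expat bindings. The helper names (`build_index`, `read_record`) and the `<record>` element are made up for illustration, and the sketch assumes a UTF-8 file whose records are written as explicit start/end tag pairs (no self-closing records). It still needs one full sequential pass to build the index, which is exactly the cost that makes this unattractive in the general case.

```python
import io
import xml.etree.ElementTree as ET
from xml.parsers import expat


def build_index(f, tag):
    """Return (start, close) byte offsets for every <tag> element in f.

    Hypothetical helper: 'start' is the offset of the '<' opening the
    start tag, 'close' the offset of the '<' opening the end tag.
    """
    parser = expat.ParserCreate()
    index, starts = [], []

    def start_element(name, attrs):
        if name == tag:
            # During a callback, CurrentByteIndex is the offset of the
            # first byte of the markup that triggered the event.
            starts.append(parser.CurrentByteIndex)

    def end_element(name):
        if name == tag:
            index.append((starts.pop(), parser.CurrentByteIndex))

    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.ParseFile(f)  # one sequential pass over the whole file
    return index


def read_record(f, start, close):
    """Seek to one indexed record and parse only that fragment."""
    f.seek(start)
    body = f.read(close - start)
    # The index stops at the '<' of the end tag; read on to its '>'.
    end_tag = b""
    while not end_tag.endswith(b">"):
        byte = f.read(1)
        if not byte:
            raise ValueError("truncated end tag")
        end_tag += byte
    return ET.fromstring(body + end_tag)


# Tiny in-memory stand-in for a large file on disk.
doc = io.BytesIO(
    b"<log>\n"
    b"  <record id='1'>alpha</record>\n"
    b"  <record id='2'>beta</record>\n"
    b"  <record id='3'>gamma</record>\n"
    b"</log>\n"
)
index = build_index(doc, "record")
third = read_record(doc, *index[2])
print(third.get("id"), third.text)
```

In real use the offsets would be written to a side file (pickle, JSON, a small database) and would have to be rebuilt whenever the XML file changes.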
And as I said before, the only use case for *huge* XML files I've ever seen used in practice is to store large streams of record-style data; data that's intended to be consumed by sequential processes (and you can do a lot with sequential processing these days; for those interested in this, digging up a few review papers on "data stream processing" might be a good way to waste some time).
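For the record-stream case, a sequential consumer can be sketched with `xml.etree.ElementTree.iterparse`. The `<log>`/`<record>` layout and the `iter_records` name are assumptions made for the example, and it assumes the records are direct children of the root element. The point is only that each record is handled and then thrown away, so memory use doesn't grow with the size of the file.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag):
    """Yield one dict per <tag> element, discarding each element after use."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield dict(elem.attrib, text=(elem.text or "").strip())
            # Drop the children processed so far so the tree never grows.
            root.clear()


# Tiny in-memory stand-in for a multi-gigabyte file.
sample = io.BytesIO(
    b"<log>"
    b"<record id='1'>alpha</record>"
    b"<record id='2'>beta</record>"
    b"<record id='3'>gamma</record>"
    b"</log>"
)
records = list(iter_records(sample, "record"))
print(records)
```

With a real file you'd pass the filename (or an open binary file) instead of the `BytesIO` object and consume the generator in a loop rather than collecting it into a list.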
Document-style XML usually fits into memory on modern machines; structures larger than that are usually split into different parts (e.g. using XInclude) and stored in a container file.
Random *modifications* to an arbitrary XML file cannot be done in place, as long as you store the file in a standard file system: any change that alters the length of the data means rewriting everything after it. And if you invent your own format, it's no longer an XML file.
</F>