Re: Fixing the XML batteries

Stefan Behnel Tue, 13 Dec 2011 11:42:24 -0800

Serhiy Storchaka, 13.12.2011 19:57:

13.12.11 16:59, Stefan Behnel wrote:

It matches my opinion though.


I would be glad to share your view, however ElementTree looks less
documented than minidom


It's certainly a lot smaller, which makes its API easier to learn and remember.

and is not a full replacement.

It's good enough for a surprisingly large part of all XML processing needs, and if you need more, there's lxml for that.

For example, I haven't found how to get the XML encoding.

True - lxml provides it, but plain ET doesn't. However, I can't think of any major use cases where you'd care about the encoding of the original input file. Just use what suits your needs on the way back out. UTF-8 will usually do just fine.
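To illustrate the point, here is a minimal sketch of reading a document in one encoding and writing it back out as UTF-8. The sample document and its content are made up for the example; the parser honours the declared input encoding, and the serialiser is simply told which encoding to produce.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a small document declared and encoded as ISO-8859-1.
source = (u'<?xml version="1.0" encoding="iso-8859-1"?>'
          u'<name>caf\xe9</name>').encode('iso-8859-1')

# The parser decodes the input according to its declaration, so the tree
# holds plain Unicode text and the original encoding no longer matters.
root = ET.fromstring(source)

# Choose whatever encoding suits you on the way out; here, UTF-8.
utf8_bytes = ET.tostring(root, encoding='utf-8')
```

After parsing, root.text is an ordinary Unicode string, which is why the input encoding is rarely needed again.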

Also, when using ElementTree instead
of minidom, the suffix "ns0:" is added to each element.

That's a "prefix", not a suffix. And since prefixes are basically useless for XML processing, it isn't commonly a problem whether they are called 'nsXY' or 'abcdefg'. It's the parser's duty to handle them for you.
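A short sketch of what that means in practice, using a made-up namespace URI and prefix: ET identifies elements by their namespace URI, not by the prefix, and if you do care about the prefix on output, register_namespace() (available since ET 1.3 / Python 2.7) lets you pick one instead of the generated 'ns0'.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and prefix, chosen only for this example.
NS = 'http://example.com/ns'
doc = ET.fromstring(
    '<abc:root xmlns:abc="http://example.com/ns"><abc:item/></abc:root>')

# Tags are stored as '{uri}localname'; the prefix plays no role here.
root_tag = doc.tag
items = doc.findall('{%s}item' % NS)

# Optional: choose the prefix used on serialisation instead of 'ns0'.
# Note that this registration is global to the module.
ET.register_namespace('abc', NS)
serialised = ET.tostring(doc)
```

Lookups work the same way whatever prefix the input document happened to use.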

I do not see how to _create_ a new element


element = Element('tagname')

and to write it with <?xml ...?> header.


That's called a "declaration". You can get it with, e.g.,

ElementTree(element).write(file, encoding='utf8')

By default, ET doesn't write it unless it can put useful information into it. (Note that the XML spec makes the declaration optional for XML 1.0 serialisation as UTF-8.)
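Put together, creating an element and serialising it with a declaration could look like the following sketch. It assumes ET 1.3 (Python 2.7 or later), where write() accepts an explicit xml_declaration argument; the child element and the in-memory buffer are only there to make the example self-contained.

```python
import io
from xml.etree.ElementTree import Element, ElementTree, SubElement

# Build a tiny tree: <tagname><child>some text</child></tagname>
root = Element('tagname')
child = SubElement(root, 'child')
child.text = 'some text'

# write() needs a file name or a file-like object; an in-memory byte
# buffer stands in for a real file here.
buf = io.BytesIO()

# xml_declaration=True requests the <?xml ...?> declaration explicitly.
ElementTree(root).write(buf, encoding='utf-8', xml_declaration=True)
data = buf.getvalue()
```

Passing a file name instead of the buffer writes the same bytes to disk.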

And the DOM interface is more familiar to those who work with some other
languages.

Not really. DOM is also considered unwieldy in many other languages. Even in a language as unwieldy as Java it's being frowned upon these days. In JavaScript, it has basically been replaced by jQuery, and many other languages also have substantially more "natural" ways to deal with XML than the DOM.

It's true, though, that ElementTree isn't a widely known interface outside of the Python world.

Yes, that's what C14N is there for, typically used for cryptography,
hashing, etc. However, MiniDOM doesn't implement that standard, so
you're on your own here.


MiniDOM suited me well enough in this respect so far. I will switch to C14N
as soon as I can.
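For readers on a current Python: the standard library has since gained a C14N 2.0 serialiser, xml.etree.ElementTree.canonicalize(), in Python 3.8. It did not exist in the versions discussed in this thread, so treat the following as a sketch for newer installations only. The input document is made up.

```python
import xml.etree.ElementTree as ET

# Two spellings of the same document: attribute order and the
# empty-element shorthand differ.
first = '<root b="2" a="1"><empty/></root>'
second = '<root a="1" b="2"><empty></empty></root>'

# canonicalize() (Python 3.8+) returns the C14N 2.0 form as a string,
# so equivalent documents serialise identically.
canonical_first = ET.canonicalize(first)
canonical_second = ET.canonicalize(second)
```

Because both inputs canonicalise to the same text, a hash or signature computed over that text is stable across such differences.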

The ET module is actually quite short (<1700 lines), so you can just
copy the Py2.7 version into your sources and optionally import it on
older Python releases. Since you only seem to depend on the serialiser
(which is worth using anyway because it is much faster in the Py2.7
version), older platform versions of cET should also work just fine with
that module copy, so you can basically just import everything from
xml.etree.cElementTree and use the ElementTree class and the tostring()
function from your own local version if the platform version is too old.

Note that ET is also still available as a separately installable
package, which may or may not be simpler for you to use.
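The conditional import described above might be sketched roughly as follows. The module name local_et27 is hypothetical and stands for a copy of the Py2.7 ElementTree.py shipped next to the script; the version check is one possible way to decide when the platform version is too old.

```python
import sys

try:
    # Fast C implementation on older Pythons (removed in Python 3.9).
    from xml.etree.cElementTree import Element, SubElement, fromstring
except ImportError:
    # Newer Pythons use the accelerated code behind this module anyway.
    from xml.etree.ElementTree import Element, SubElement, fromstring

if sys.version_info < (2, 7):
    # Hypothetical local copy of the Py2.7 module, used only for its
    # serialiser (the ElementTree class and tostring()).
    from local_et27 import ElementTree, tostring
else:
    from xml.etree.ElementTree import ElementTree, tostring

# Elements built by the platform module are serialised by whichever
# serialiser was selected above.
root = Element('config')
SubElement(root, 'entry').text = 'value'
serialised = tostring(root)
```

The rest of the script then uses these names without caring which implementation is behind them.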


Thanks, but it is too bulky for my small scripts (which I have decided to
update from Python 2.3 or 2.4 to modern Python 3 and 2.6+). I had better
postpone a full migration for half a year or a year, until Python 2.7
and 3.2 appear in stable versions of popular distributions.

In case you are only dealing with small in-house scripts, I'd suggest installing ET 1.3 (or, even better, lxml) on the machines where you want to use it. Then you no longer have to care about those dependencies.

Thank you for ET, it really is more convenient for some applications
(especially when working with the text in elements).

Careful. ;) I'm just the author of lxml, not of ET. That would be Fredrik Lundh.


Stefan

--
http://mail.python.org/mailman/listinfo/python-list
