On Nov 4, 1:06 pm, Kee Nethery <k...@kagi.com> wrote:
> On Nov 3, 2009, at 5:27 PM, John Machin wrote:
>
> > On Nov 4, 11:01 am, Kee Nethery <k...@kagi.com> wrote:
> >> Why is this not working and what do I need to do to use Elementtree
> >> with unicode?
>
> > What you need to do is NOT feed it unicode. You feed it a str object
> > and it gets decoded according to the encoding declaration found in the
> > first line.
>
> That it uses "the encoding declaration found in the first line" is the
> nugget of data that is not in the documentation that has stymied me
> for days. Thank you!

And under the "don't repeat" principle, it shouldn't be in the
Elementtree docs; it's nothing special about ET -- it's part of the
definition of an XML document (which for universal loss-free
transportability naturally must be encoded somehow, and the document
must state what its own encoding is (if it's not the default
(UTF-8))).

> The other thing that has been confusing is that I've been using "dump"
> to view what is in the elementtree instance and the non-ASCII
> characters have been displayed as "numbered
> entities" (<city>&#26575;&#24066;</city>) and I know that is not the
> representation I want the data to be in. A co-worker suggested that
> instead of "dump" that I use "et.tostring(theResponseXml,
> encoding='utf-8')" and then print that to see the characters. That
> process causes the non-ASCII characters to display as the glyphs I
> know them to be.
>
> If there was a place in the official docs for me to append these
> nuggets of information to the sections for
> "xml.etree.ElementTree.XML(text)" and
> "xml.etree.ElementTree.dump(elem)" I would absolutely do so.

I don't understand ... tostring() is in the same section as dump(),
about two screen-heights away. You want to include the tostring() docs
in the dump() docs? The usual idea is not to get bogged down in the
first function that looks at first glance like it might do what you
want ("look at the glyphs") but doesn't (it writes a (transportable)
XML stream) but press on to the next plausible candidate.

--
http://mail.python.org/mailman/listinfo/python-list
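
To illustrate the two points made in this thread, here is a minimal sketch. It assumes Python 3, where the encoded "str object" of the 2009-era Python 2 discussed above corresponds to `bytes`. The `<city>` element and its text come from the quoted message; the `<response>` wrapper and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample text from the thread, written with escapes so this source file
# stays pure ASCII: U+67CF U+5E02.
city_name = "\u67cf\u5e02"

# Encoded bytes, as they might arrive from a file or a network response.
# The encoding declaration in the first line tells the parser how to
# decode them. The <response> wrapper is hypothetical.
xml_bytes = (
    '<?xml version="1.0" encoding="utf-8"?>'
    "<response><city>" + city_name + "</city></response>"
).encode("utf-8")

# Feed the parser the encoded bytes, not already-decoded text.
root = ET.XML(xml_bytes)
city = root.find("city").text  # decoded text

# The default serialisation is ASCII-only, so non-ASCII characters come
# out as numeric character references (what the poster saw).
as_ascii = ET.tostring(root)

# Asking for UTF-8 keeps the characters themselves; decode the resulting
# bytes before printing to see the glyphs.
as_utf8 = ET.tostring(root, encoding="utf-8")
print(as_utf8.decode("utf-8"))
```

Both outputs are equally valid XML for the same document; the only difference is whether the non-ASCII characters are written directly or as character references.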