Hi,

Steven D'Aprano wrote:
> On Fri, 01 Feb 2008 00:40:01 +1100, Ben Finney wrote:
>
>> Quite apart from a human thinking it's pretty or not pretty, it's *not
>> valid XML* if the XML declaration isn't immediately at the start of the
>> document <URL:http://www.w3.org/TR/xml/#sec-prolog-dtd>. Many XML
>> parsers will (correctly) reject such a document.
>
> You know, I'd really like to know what the designers were thinking when
> they made this decision.

[had a good laugh here]

> This is legal XML:
>
> """<?xml version="1.0"?>
> <greeting>Hello, world!</greeting>"""
>
> and so is this:
>
> """
> <greeting >Hello, world!</greeting >"""
>
>
> but not this:
>
> """ <?xml version="1.0"?>
> <greeting>Hello, world!</greeting>"""
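
For anyone who wants to check the three quoted cases against a real parser, here is a minimal sketch using the standard library's xml.etree.ElementTree (which sits on top of expat). The names DOCS, is_well_formed and results are made up for this illustration, and other parsers may word their errors differently:

```python
import xml.etree.ElementTree as ET

# The three documents from the quoted message, as raw bytes.
# (Labels and variable names are illustrative only.)
DOCS = {
    "declaration first":
        b'<?xml version="1.0"?>\n<greeting>Hello, world!</greeting>',
    "no declaration, leading newline":
        b'\n<greeting >Hello, world!</greeting >',
    "space before declaration":
        b' <?xml version="1.0"?>\n<greeting>Hello, world!</greeting>',
}


def is_well_formed(data: bytes) -> bool:
    """Return True if ElementTree can parse the bytes, False otherwise."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


results = {label: is_well_formed(doc) for label, doc in DOCS.items()}

for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the last document should be rejected, because its declaration is no longer at the very start of the byte stream.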
It's actually not that stupid. When you leave out the declaration, the parser already knows the encoding: by spec it is UTF-8 (or UTF-16, signalled by a byte order mark). Leading ASCII whitespace is therefore unambiguous, and it doesn't matter. It's just as if the declaration had come *before* the whitespace, at the very beginning of the byte stream.

But if you add a declaration, the encoding can change for the whole document (including the declaration itself!), so you have to give the parser a chance to actually parse the declaration. How is it supposed to know that the bytes before the declaration *are* whitespace before it knows the encoding?

Stefan
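
P.S. To make the encoding point concrete, here is a rough sketch of the kind of byte sniffing a parser can do on the first four bytes of a document, loosely modelled on the non-normative autodetection appendix of the XML 1.0 spec. SIGNATURES, sniff_family and the other names are invented for this example, the table is deliberately incomplete (no byte order marks, no UCS-4), and cp037 is just one EBCDIC flavour that Python happens to ship:

```python
# One space character, encoded four different ways.
encoded_space = {
    codec: " ".encode(codec)
    for codec in ("utf-8", "utf-16-le", "utf-16-be", "cp037")
}

# Simplified signature table: what the start of an XML declaration
# looks like in a few encoding families (hypothetical helper data).
SIGNATURES = {
    "<?xm".encode("ascii"): "ASCII-compatible (UTF-8, ISO 8859, ...)",
    "<?".encode("utf-16-be"): "UTF-16, big-endian",
    "<?".encode("utf-16-le"): "UTF-16, little-endian",
    "<?xm".encode("cp037"): "EBCDIC",
}


def sniff_family(data: bytes):
    """Guess the encoding family from the first four bytes, or None."""
    return SIGNATURES.get(data[:4])


decl = '<?xml version="1.0" encoding="cp037"?><greeting/>'

for codec, raw in encoded_space.items():
    print(f"space in {codec}: {raw!r}")

# With the declaration at offset 0, the family is recognisable.
print(sniff_family(decl.encode("cp037")))
# With one leading space, the first four bytes match nothing.
print(sniff_family((" " + decl).encode("cp037")))
```

An EBCDIC space is the byte 0x40, which an ASCII-minded parser would read as "@", so the parser cannot even classify the leading bytes as whitespace until the declaration has told it which encoding is in use.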