Hm yes, that is true. In Firefox, on the other hand, the response header is "Content-Type: text/xml; charset=UTF-8".
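
Since the server apparently declares a different charset depending on the client, it seems safest to decode the body with whatever charset the response's own Content-Type header declares, rather than assuming one. Here is a minimal sketch in Python 3 (the thread itself uses Python 2's urllib). The helper names charset_from_content_type and decode_body are made up for illustration, and the UTF-8 fallback is my assumption based on the XML default quoted below; it ignores UTF-16 and BOM sniffing.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type header value.

    Falls back to `default` when no charset parameter is present
    (assumption: UTF-8, following the XML default for undeclared encodings).
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns the
    # fallback object when the header carries no charset parameter.
    return msg.get_content_charset(default)


def decode_body(body, content_type):
    """Decode raw response bytes using the charset the header declares."""
    return body.decode(charset_from_content_type(content_type))


# The same text, as it would arrive under each of the two headers
# seen in this thread.
latin1_body = "M\u00fcnchen".encode("iso-8859-1")
utf8_body = "M\u00fcnchen".encode("utf-8")

print(decode_body(latin1_body, "text/xml; charset=ISO-8859-1"))
print(decode_body(utf8_body, "text/xml; charset=UTF-8"))
```

With a live response in Python 3, urllib.request.urlopen(url).headers.get_content_charset() should give the declared charset directly, so the first helper is only needed when all you have is the raw header string.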

On Sat 17, 13:16 -0700, Mark Tolonen wrote:
>
> "Diez B. Roggisch" <de...@nospam.web.de> wrote in message
> news:7jub5rf37div...@mid.uni-berlin.de...
> [snip]
> >This is weird. I looked at the site in FireFox - and it was
> >displayed correctly, including umlauts. Bringing up the
> >info-dialog claims the page is UTF-8, the XML itself says so as
> >well (implicit, through the missing declaration of an encoding) -
> >but it clearly is *not* utf-8.
> >
> >One would expect google to be better at this...
> >
> >Diez
>
> According to the XML 1.0 specification:
>
> "Although an XML processor is required to read only entities in the
> UTF-8 and UTF-16 encodings, it is recognized that other encodings
> are used around the world, and it may be desired for XML processors
> to read entities that use them. In the absence of external character
> encoding information (such as MIME headers), parsed entities which
> are stored in an encoding other than UTF-8 or UTF-16 must begin with
> a text declaration..."
>
> So UTF-8 and UTF-16 are the defaults supported without an xml
> declaration in the absence of external encoding information. But we
> have external character encoding information:
>
> >>> f = urllib.urlopen("http://www.google.de/ig/api?weather=Muenchen")
> >>> f.headers.dict['content-type']
> 'text/xml; charset=ISO-8859-1'
>
> So the page seems correct.
>
> -Mark
>
> --
> http://mail.python.org/mailman/listinfo/python-list