On Monday 13 September 2010, it occurred to Robert Kern to exclaim:
> On 9/13/10 2:00 PM, Stef Mientki wrote:
> > On 12-09-2010 19:28, Robert Kern wrote:
> >> On 9/12/10 4:14 AM, Stef Mientki wrote:
> >>> hello,
> >>>
> >>> Is it possible to get the encoding of a python file from the first
> >>> source line, (if there's any),
> >>> after importing it ( with '__import__' )
> >>>
> >>> # -*- coding: windows-1252 -*-
> >>
> >> The regular expression used to match the encoding declaration is given
> >> here:
> >>
> >> http://docs.python.org/reference/lexical_analysis.html#encoding-declarations
> >
> > yes, but then I've to read the first line of the file myself.
> >
> > In the meanwhile I found another (better ?) solution, (I'm using Python
> > 2.6)
> >
> >
> > Place these 2 lines at the top of the file
> > # -*- coding: windows-1252 -*-
> > from __future__ import unicode_literals
> >
> > or these
> > # -*- coding: utf-8 -*-
> > from __future__ import unicode_literals
> >
> > then you always get the correct unicode string back.
>
> Ah. I see. You don't actually need to know the encoding; you just want to
> use literals with raw, unescaped characters embedded in them.
>
> This may interfere with the cases when you need a real str object.

If you assume that unicode_literals from __future__ works, you can also assume
that the b'bytestring' syntax works.

> In Python 2.x, if you want a unicode literal, just use one like so: u'ß'. As
> long as the encoding declaration is correct, this will work just fine.

-- 
http://mail.python.org/mailman/listinfo/python-list
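
For the original question (getting the declared encoding by reading the top of the file yourself), a minimal sketch could look like the following. The pattern is the one given in the Python 2 language reference linked above. The helper name `declared_encoding`, the `default` parameter and the "ascii" fallback are assumptions made for illustration, not anything from the standard library, and the sketch ignores a UTF-8 byte order mark.

```python
import re

# Pattern from the Python 2 language reference for encoding declarations.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(path, default="ascii"):
    """Return the encoding declared in the first two lines of a source file.

    Hypothetical helper: falls back to `default` (an assumption; Python 2
    treats undeclared source as ASCII) when no declaration is found.
    """
    with open(path, "rb") as source:
        for _ in range(2):  # only line 1 or line 2 may carry the declaration
            line = source.readline()
            # The declaration has to sit in a comment line.
            if not line.lstrip().startswith(b"#"):
                continue
            match = CODING_RE.search(line.decode("ascii", "replace"))
            if match:
                return match.group(1)
    return default
```

Calling `declared_encoding(module.__file__)` on an imported module would only make sense if `__file__` points at the `.py` source rather than a compiled `.pyc` file, so the path may need adjusting first.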