On Thu, Jul 23, 2015 at 3:58 PM, dieter <die...@handshake.de> wrote:
> Steven D'Aprano <st...@pearwood.info> writes:
>> On Wed, 22 Jul 2015 08:17 pm, anatoly techtonik wrote:
>>> Is there a way to know encoding of string (bytes) literal
>>> defined in source file? For example, given that source:
>>>
>>> # -*- coding: utf-8 -*-
>>> from library import Entry
>>> Entry("текст")
>>>
>>> Is there any way for Entry() constructor to know that
>>> string "текст" passed into it is the utf-8 string?
>> ...
>> The right way to deal with this is to use an actual Unicode string:
>>
>> Entry(u"текст")
>>
>> and make sure that the file is saved using UTF-8, as the encoding cookie
>> says.
>
> In order to follow this recommendation, is there an easy way to
> learn about the "encoding cookie"'s value -- rather than parsing
> the first two lines of the source file (which may not always be available).

No; you don't need to. If you use a Unicode string literal (as marked
by the u"..." notation), the Python compiler will handle the decoding
for you. The string that's passed to Entry() will simply be a string
of Unicode codepoints - no encoding information needed. If you then
want that in UTF-8, you can encode it explicitly.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
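
To illustrate the point about Unicode literals, here is a minimal sketch. The Entry class below is a hypothetical stand-in for library.Entry (the real one is not shown in this thread), and the file is assumed to be saved as UTF-8, as the cookie declares. The compiler decodes the literal, so Entry() only ever sees code points; UTF-8 bytes appear only when you ask for them.

```python
# -*- coding: utf-8 -*-

class Entry(object):
    """Hypothetical stand-in for library.Entry; it just stores the text."""

    def __init__(self, text):
        # 'text' is already a Unicode string: the compiler decoded the
        # literal using the encoding declared in the cookie above.
        self.text = text


entry = Entry(u"текст")

# Encode explicitly only at the point where bytes are actually needed.
encoded = entry.text.encode("utf-8")

print(len(entry.text))  # 5 code points
print(len(encoded))     # 10 bytes, two per Cyrillic letter in UTF-8
```

Entry() never has to ask which encoding the source file used; that question is settled once, at compile time, by the cookie.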