Hi everybody. I've played with encodings in Python for a few hours, but they're still somewhat confusing to me. So I've written a test file (encoded as UTF-8) and put everything I think is true in a comment at the beginning of the script. Could you check whether it's correct? (As a side note, the script does what I intended it to do.)
One more thing: is there some mechanism to avoid writing 'something'.decode('utf-8') all the time? Some sort of function call to tell the Python interpreter that I'd like it to implicitly decode all string constants in the script with a specified encoding?

Here's my script:

-------------------
# vim: set encoding=utf-8 :

"""
----- encoding and py -----

- 1st (or 2nd) line tells py interpreter encoding of file
    - if this line is missing, interpreter assumes 'ascii'
    - it's possible to use variations of first line
    - the first or second line must match the regular expression
      "coding[:=]\s*([-\w.]+)" (PEP-0263)
    - some variations:

'''
# coding=<encoding name>
'''

'''
#!/usr/bin/python
# -*- coding: <encoding name> -*-
'''

'''
#!/usr/bin/python
# vim: set fileencoding=<encoding name> :
'''

    - this version works for my vim:

'''
# vim: set encoding=utf-8 :
'''

- constants can be given via str.decode() method or via unicode constructor

- if locale is used, it shouldn't be set to 'LC_ALL' as it changes encoding
"""

import datetime, locale

#locale.setlocale(locale.LC_ALL,'croatian') # changes encoding
locale.setlocale(locale.LC_TIME,'croatian') # sets correct date format, but encoding is left alone

print 'default locale:', locale.getdefaultlocale()

s='abcdef ČčĆćĐđŠšŽž'.decode('utf-8')
ss=unicode('ab ČćŠđŽ','utf-8')

# date part of string is decoded as cp1250, because it's default locale
all=datetime.date(2000,1,6).strftime("'%d.%m.%Y.', %x, %A, %B, ").decode('cp1250')+'%s, %s' % (s, ss)

print all
-------------------
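One way to check which of the header variations above the quoted PEP-0263 pattern accepts is to run the pattern over each candidate line. This is only a rough sketch: the helper name `declared_encoding` and the sample lines are made up for illustration, and the interpreter's own detection may differ in details (for example, it only looks at the first two lines of the file).

```python
import re

# Pattern as quoted in the script's comment (taken from PEP-0263).
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(line):
    """Return the encoding named in a header line, or None if nothing matches.

    Hypothetical helper for illustration; not part of the standard library.
    """
    match = CODING_RE.search(line)
    return match.group(1) if match else None


# Sample header lines, with utf-8 substituted for <encoding name>.
candidates = [
    "# coding=utf-8",
    "# -*- coding: utf-8 -*-",
    "# vim: set fileencoding=utf-8 :",
    "# vim: set encoding=utf-8 :",
    "# just a comment",
]

for line in candidates:
    print("%-35s -> %s" % (line, declared_encoding(line)))
```

Both vim forms match because the pattern is searched anywhere in the line, so `fileencoding=` and `encoding=` each contain the `coding=` it looks for.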