Gerhard Fiedler wrote: > If I understand you correctly, you are saying that if I distribute a file > with the following lines: > > s = "é" > print s > > I basically need to distribute also the information how the file is encoded > and every user needs to use the same (or a compatible) encoding for reading > this file?
if you put a, say, chr(233) in an 8-bit string literal in your source code, whoever runs your program will get a chr(233) byte (unless someone's recoded the file on the way; ordinary file copies and installation tools usually don't do that). how your program is treating that chr(233) is up to your program. to write robust and future-proof code, - use Unicode literals if you want to put non-ASCII *text* in Python string literals, and use a PEP 263-style coding directive to tell the parser what encoding your file is using: http://www.python.org/dev/peps/pep-0263/ - avoid putting non-ASCII characters in 8-bit literal strings; use escape sequences if you need to embed binary data in a string literal. also see the "lexical analysis" section in the language reference: http://pyref.infogami.com/lexical-analysis </F>
-- http://mail.python.org/mailman/listinfo/python-list