On Thu, Aug 6, 2009 at 12:41 PM, Robert Dailey<rcdai...@gmail.com> wrote:
> On Aug 6, 11:31 am, "Richard Brodie" <r.bro...@rl.ac.uk> wrote:
>> "Robert Dailey" <rcdai...@gmail.com> wrote in message
>>
>> news:29ab0981-b95d-4435-91bd-a7a520419...@b15g2000yqd.googlegroups.com...
>>
>> > UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
>> > position 1650: character maps to <undefined>
>>
>> > The file is defined as ASCII.
>>
>> That's the problem: ASCII is a seven bit code. What you have is
>> actually ISO-8859-1 (or possibly Windows-1252).
>>
>> The different ISO-8859-n variants assign various characters
>> to '\xa9'. Rather than being Western-European centric and assuming
>> ISO-8859-1 by default, Python throws an error when you stray
>> outside of strict ASCII.
>
> Thanks for the help guys. Sorry I left out code, I wasn't sure at the
> time if it would be helpful. Below is my code:
>
>
> #========================================================
> def GetFileContentsAsString( file ):
>     f = open( file, mode='r', encoding='cp1252' )
>     contents = f.read()
>     f.close()
>     return contents
>
> #========================================================
> def ReplaceVersion( file, version, regExps ):
>     #match = regExps[0].search( 'FILEVERSION 1,45332,2100,32,' )
>     #print( match.group() )
>     text = GetFileContentsAsString( file )
>     print( text )
>
>
> As you can see, I am trying to load the file with encoding 'cp1252'
> which, according to the python 3.1 docs, translates to windows-1252. I
> also tried 'latin_1', which translates to ISO-8859-1, but this did not
> work either. Am I doing something else wrong?
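
This is why we need code and full tracebacks. There's a good chance
that your error is on the print( text ) line, not in the file read.
In Python 3, sys.stdout is a text stream with its own encoding, which
comes from the console (on Windows, typically a code page such as
cp437). When you print your string, Python has to convert it to bytes
in that encoding. The 'charmap' in the message points to one of those
code-page codecs rather than ASCII: if the console's code page has no
'\xa9' (the copyright sign), the conversion fails and you get your
error.

Check sys.stdout.encoding to see what the console is using. You can
try print( text.encode( 'cp1252' ) ) instead; it will not raise, but
in Python 3 it prints the bytes repr (b'...') rather than the text
itself. To see the text, either encode for the console's encoding with
errors='replace', or write the encoded bytes to sys.stdout.buffer.

> --
> http://mail.python.org/mailman/listinfo/python-list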
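
A minimal sketch of the point about print() and the console encoding.
It assumes the console uses cp437, which is only a guess at the
poster's setup; the sample string and the printable() helper are
made up for illustration and are not from the original code.

```python
import sys

# Stand-in for the file contents; '\xa9' is the copyright sign.
text = "Copyright \xa9 2009"

# cp437 (assumed console code page) has no copyright sign, so encoding
# fails with the same kind of 'charmap' error as in the traceback.
error_message = None
try:
    text.encode("cp437")
except UnicodeEncodeError as exc:
    error_message = str(exc)

# cp1252 does have it, which is why reading the file itself works.
encoded = text.encode("cp1252")


def printable(s, encoding):
    """Replace characters that `encoding` cannot represent with '?'."""
    return s.encode(encoding, errors="replace").decode(encoding)


# What the assumed cp437 console could actually display.
safe = printable(text, "cp437")

# Print using whatever encoding this stdout really reports.
print(printable(text, sys.stdout.encoding or "ascii"))
```

Replacing unprintable characters loses information, so this is only
suitable for debugging output. Writing text.encode( 'cp1252' ) to
sys.stdout.buffer is the alternative when the exact bytes matter.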