Ulrich Eckhardt wrote:
> Gilles Ganault wrote:
>> I'm getting this error while downloading and parsing web pages:
>>
>> =====
>> title = m.group(1)
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
>> 48: ordinal not in range(128)
>> =====
>>
>> From what I understand, it's because some strings are Unicode, and
>> hence contain characters that are illegal in ASCII.
>
> You just need to use a codec according to the encoding of the webpage. Take
> a look at
> http://wiki.python.org/moin/Python3UnicodeDecodeError
> It is about Python 3, but the principles apply nonetheless. In any case,
> throwing the error at a websearch will turn up lots of solutions.
>

I won't believe that statement is producing the error until I see a
traceback. As far as I'm aware the re module can handle Unicode. Getting
a UnicodeDecodeError in an assignment would be unusual, to say the least.
I suppose it's not impossible that calling the .group() method of a match
object might raise one, but it seems unlikely.
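
For illustration only, here is a minimal sketch of where an error like this usually comes from, and of the "use the page's codec" advice quoted above. It is written for Python 3. The page fragment, the ISO-8859-1 encoding and the title pattern are all assumptions made up for the example, not taken from the original poster's code.

```python
import re

# Hypothetical page fragment, as raw bytes off the network. We ASSUME the
# page is ISO-8859-1 encoded; in that encoding the byte 0xe9 is "é".
raw = b"<html><head><title>Caf\xe9 de Paris</title></head></html>"

# Decoding with the ascii codec is what raises the reported error,
# because 0xe9 is outside the 0-127 range that ASCII covers.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decoding with the page's actual encoding succeeds, and the re module
# then works on the resulting text without complaint.
text = raw.decode("iso-8859-1")
m = re.search(r"<title>(.+?)</title>", text)
title = m.group(1)  # "Café de Paris"

print(error_message)
print(title)
```

Note that in this sketch the exception comes from the decode step, not from m.group(1) or the assignment, which is why a full traceback matters for finding the real culprit.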

regards
 Steve
--
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/