Re: Regular expression

Soltys Fri, 20 Jun 2008 01:21:42 -0700

Duncan Booth wrote:

Sallu <[EMAIL PROTECTED]> wrote:
string = 'riché'
...
unicode(string)).encode('ASCII', 'ignore')
...
Output :
sys:1: DeprecationWarning: Non-ASCII character '\xc3' in file regu.py
on line 4, but no encoding declared; see
http://www.python.org/peps/pep-0263.html for details
riché
Traceback (most recent call last):
  File "regu.py", line 13, in ?
    msg=strip_accents(string)
  File "regu.py", line 10, in strip_accents
    return unicodedata.normalize('NFKD',
unicode(string)).encode('ASCII', 'ignore')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
4: ordinal not in range(128)
The problem is the expression unicode(string), which is equivalent to saying string.decode('ascii').
The string contains a non-ASCII character, so the decode fails. You should specify whatever encoding you used for the source file. From the error message it looks like you used UTF-8, so string.decode('utf-8') should give you a unicode string to work with.
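
To make the suggestion above concrete, here is a minimal sketch of strip_accents with an explicit decode. The function body is a guess based on the traceback, since the full script was not posted; the encoding parameter and the byte-string literal are assumptions added for illustration.

```python
import unicodedata

def strip_accents(raw, encoding='utf-8'):
    # Decode the bytes explicitly instead of relying on the implicit
    # ASCII default that unicode(string) uses.
    text = raw.decode(encoding)
    # NFKD splits an accented letter into the base letter plus a
    # separate combining accent character.
    decomposed = unicodedata.normalize('NFKD', text)
    # Encoding to ASCII with 'ignore' drops the combining accent.
    return decomposed.encode('ascii', 'ignore')

# The UTF-8 bytes of the word from the original post; 0xc3 sits at
# position 4, matching the error message.
raw = b'rich\xc3\xa9'

# Decoding as ASCII reproduces the failure from the traceback.
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

result = strip_accents(raw)  # the accent is gone, only ASCII bytes remain
```

If the file was saved in some other encoding (latin-1, say), pass that name instead of 'utf-8'.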


Or just specify the source encoding like this:
#!/usr/bin/python
# -*- coding: utf-8 -*-

or

#!/usr/bin/python
# coding=utf-8
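
Note that the declaration only tells the interpreter how to read the file, which silences the DeprecationWarning. A plain 'riché' literal is still a byte string, so unicode(string) would still fail on it. A small sketch of one way around that, assuming you can change the literal to a unicode one (the variable names are made up for illustration):

```python
# -*- coding: utf-8 -*-
import unicodedata

# The u prefix makes this a unicode string from the start, so no
# decode step is needed before normalizing.
text = u'riché'

# Decompose the accented letter, then drop the non-ASCII accent mark.
ascii_text = unicodedata.normalize('NFKD', text).encode('ascii', 'ignore')
```

The declaration has to be on the first or second line of the file to take effect.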



--
Soltys

"Free software is a matter of liberty not price"
--
http://mail.python.org/mailman/listinfo/python-list
