BerlinBrown wrote:
> With this code, ignore/replace still generate an error
>
> # Encode to simple ascii format.
> field.full_content = field.full_content.encode('ascii', 'replace')
>
> Error:
>
> [0/1] 'ascii' codec can't decode byte 0xe2 in position 14317: ordinal
> not in range(128)
>
> The document in question; is a wikipedia document. I believe they use
> latin-1 unicode or something similar. I thought replace and ignore
> were supposed to replace and ignore?

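Is field.full_content a str or a unicode? You probably haven't decoded it
from a byte string yet. If it is still a str, calling .encode() on it makes
Python decode it with the default ASCII codec first, and that implicit decode
is what raises the error. The 'replace' argument only applies to the encode
step, so it never gets a chance to help.

>>> field.full_content = field.full_content.decode('utf8', 'replace')
>>> field.full_content = field.full_content.encode('ascii', 'replace')

Why do you want to use ASCII? UTF-8 is great. :-)

--
http://mail.python.org/mailman/listinfo/python-list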
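
For anyone reading this later on Python 3, where the str/unicode split became bytes/str, here is a rough sketch of the same decode-then-encode idea. The helper name to_ascii and its source_encoding parameter are made up for illustration, and UTF-8 is only an assumption about the input, so treat this as one possible approach and not the definitive fix.

```python
def to_ascii(raw, source_encoding="utf-8"):
    """Return ASCII bytes for raw, replacing anything that cannot be represented.

    to_ascii and source_encoding are hypothetical names for this sketch;
    UTF-8 is an assumed default, not something the input guarantees.
    """
    if isinstance(raw, bytes):
        # Step 1: bytes -> text. Undecodable bytes become U+FFFD.
        text = raw.decode(source_encoding, "replace")
    else:
        # Already text, so there is nothing to decode.
        text = raw
    # Step 2: text -> ASCII bytes. Non-ASCII characters become '?'.
    return text.encode("ascii", "replace")


# Sample input: UTF-8 bytes containing a curly apostrophe and an accented e.
sample = "It\u2019s caf\u00e9".encode("utf-8")
print(to_ascii(sample))  # b'It?s caf?'
```

The point of doing the decode explicitly is that you choose the source encoding and the error handler yourself, instead of relying on a default codec.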