Wade Leftwich wrote:
> Gábor Farkas wrote:
>> Niran Babalola wrote:
>>> I've been developing a Django application using SQLite, and now I'm
>>> trying to move over to MySQL and actually launch the site. The
>>> application is storing data from RSS/Atom feeds using Universal Feed
>>> Parser, which uses unicode strings for all its data. When I try to
>>> store information from a feed into my MySQL database, I get the
>>> following error:
>>>
>>> "UnicodeEncodeError: 'latin-1' codec can't encode characters in
>>> position 75-76: ordinal not in range(256)"
>>>
>>> Everything worked fine with SQLite, but I haven't been able to get past
>>> this problem with MySQL. I tried dropping the database and recreating
>>> it with utf8 as the default encoding, but that didn't help either. Any
>>> ideas?
>> hi,
>>
>> please give us the whole stacktrace, and the line of the code where this
>> happens.
>>
>> generally, this might happen because you're trying to directly put
>> unicode text into the database (i have no experience with mysql, so i
>> can be wrong here). the various database-backends react differently to
>> save-unicode-data (sqlite3 usually works ok, psycopg1 fails, psycopg2
>> works etc.).
>>
>> so, before saving the data, convert it explicitly to byte-strings with a
>> suitable charset (i assume you're using utf8).
>>
>> gabor
> 
> MySQL defaults to latin-1 encoding, though in version 5 you can specify
> UTF-8. So as Gabor says, you have to convert explicitly to byte-strings
> encoded so your db will accept them. From utf-8 to latin1:
> 
> def utf8tolatin1(s):
>     return s.decode('utf-8', 'ignore').encode('latin1', 'xmlcharrefreplace')
> 
> In [8]: u'\u2014'.encode('utf-8')
> Out[8]: '\xe2\x80\x94'
> 
> In [9]: utf8tolatin1('\xe2\x80\x94')
> Out[9]: '&#8212;'
> 
> Also see:
> http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html
> 

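For readers on Python 3, where text and bytes are separate types, the quoted helper can be sketched roughly as below. This is only an illustration under that assumption: the name utf8_to_latin1 and the sample values are hypothetical and not part of the original post.

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as Latin-1 bytes (hypothetical helper name)."""
    # 'ignore' silently drops byte sequences that are not valid UTF-8.
    text = data.decode('utf-8', 'ignore')
    # Characters with no Latin-1 equivalent become numeric XML character
    # references such as '&#8212;' instead of raising UnicodeEncodeError.
    return text.encode('latin-1', 'xmlcharrefreplace')


# U+2014 has no Latin-1 code point, so it is written as a reference.
em_dash_utf8 = '\u2014'.encode('utf-8')
converted = utf8_to_latin1(em_dash_utf8)

# U+00E9 exists in Latin-1, so it survives as a single byte.
cafe = utf8_to_latin1('caf\u00e9'.encode('utf-8'))

print(converted)
print(cafe)
```

Note that the character references are only meaningful to something that later interprets them, for example a browser rendering the stored text as HTML.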
thanks for the info.

to the original poster:
please note that this way you might lose some character data. as long as
the database stores latin1, there's no better way: you simply cannot
represent every unicode character in latin1.

gabor
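
To make the warning about lost data concrete, here is a small sketch in Python 3 syntax. The sample string is made up for illustration; it shows a strict Latin-1 encode failing in the same way as the reported error, a lossy encode that cannot be reversed, and a UTF-8 round trip that can.

```python
# Sample text (an assumption for this sketch): U+2014 is outside Latin-1,
# while U+00E9 is inside it.
text = 'dash \u2014 and caf\u00e9'

# A strict encode raises the same kind of error as in the original report.
try:
    text.encode('latin-1')
    strict_failed = False
except UnicodeEncodeError as exc:
    strict_failed = True
    failing_position = exc.start  # index of the first unencodable character

# 'replace' substitutes '?' for anything Latin-1 cannot represent,
# so decoding again does not give back the original text.
replaced = text.encode('latin-1', 'replace')
round_trip = replaced.decode('latin-1')

# UTF-8 can represent every character, so its round trip is lossless.
utf8_round_trip = text.encode('utf-8').decode('utf-8')

print(strict_failed, failing_position)
print(round_trip == text, utf8_round_trip == text)
```

Whether UTF-8 bytes are stored intact still depends on the table and connection character sets on the MySQL side, which this sketch does not cover.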

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users
-~----------~----~----~----~------~----~------~--~---
