On Mon, Sep 1, 2008 at 4:23 PM, Max <[EMAIL PROTECTED]> wrote:

>
> Hi,
> I've been trying to figure out why I can't handle utf-8 properly on my
> production server, while it works perfectly on my local django dev
> server
>
> When I try to run this line
> unicode_data.decode("utf-8")
>

What is 'unicode_data' here?  A model field?

>
> With data coming from DB
> " Rémi "
>
> I get on production only
> 'ascii' codec can't encode character u'\xe9'
>

I am confused why you are trying to decode() data coming from the DB.  You
decode() a byte string to turn it into a unicode object, but assuming you
are running a Django from after the Unicode branch merge (which '0.97'
could be, but '0.97' is not actually a release, just some level from SVN
after 0.96, so it is hard to be sure), the data should already be unicode
when you see it.

The error you are getting on production is what I would expect for the
case where the data read from the DB has already been transformed from
string to unicode:

Python 2.5.1 (r251:54863, Jul 31 2008, 23:17:40)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = 'Rémi'
>>> type(s)
<type 'str'>
>>> u1 = s.decode('utf-8')
>>> type(u1)
<type 'unicode'>
>>> u1.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "encodings/utf_8.py", line 16, in decode
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
>>>

So my question would not be why is this failing on production, but why is
it working on development?  The one case I know of where data is returned
as strings instead of unicode for MySQL is when you have set a binary
collation on the column in the database, but from what you list below you
have not done that.  You are running an alpha version of MySQL (6.0.3) on
development, so perhaps that is causing trouble.

To debug further, please elaborate on what 'unicode_data' is (a model
field?  If so, provide the model field definition).  It would be nice to
know what type it is on each machine -- if it is a model field it should be
<type 'unicode'>, assuming it's a character-type field.  And you should not
have to be calling decode() on it at all.

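If you do need code that tolerates both cases while you track down the
difference between the two machines, something along these lines might
help.  This is only a sketch: to_unicode, text_type and raw are names I
made up for illustration, not anything Django or MySQLdb provides.  The
last part reproduces the failing step on its own: as far as I understand
it, Python 2 handles decode() on a unicode object by first encoding it
with the default ASCII codec, which would explain why a decode() call
reports an encode error.

```python
# Hypothetical helper (not part of Django or MySQLdb): decode only when the
# value is still a byte string, and pass unicode objects through untouched.

text_type = type(u'')  # 'unicode' on Python 2, 'str' on Python 3


def to_unicode(value, encoding='utf-8'):
    if isinstance(value, text_type):
        return value              # already decoded, nothing to do
    return value.decode(encoding)


raw = u'R\xe9mi'.encode('utf-8')  # the UTF-8 bytes for the name in question
decoded = to_unicode(raw)         # byte string -> unicode
same = to_unicode(decoded)        # unicode comes back unchanged

# The implicit step behind the production error: encoding a non-ASCII
# unicode object with the ASCII codec fails.
try:
    decoded.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

That said, a helper like this only hides the symptom.  Checking
type(unicode_data) on both machines is still the quickest way to see which
side is handing back byte strings.
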
Karen

>
> I spent a lot of time on the forums and I could fix that rather
> easily on my dev server with the addition of DEFAULT_CHARSET = 'utf-8'
> in settings.py and re-defining collations and charsets in DB.
> But for some reason there is still some ascii conversion in production
> and it's driving me Crazy!!!
> I tried it all, I don't know what to do next. Please help!
>
>
> DEV ENV
> ----------------------------------------------------------------------
> python 2.5.1
> Mysql: 6.0.3-alpha-community MySQL Community Server (GPL)
> django: 0.97
> os: vista
>
>
> PROD ENV
> ----------------------------------------------------------------------
> python 2.4.4
> Mysql: 5.0.32-Debian_7etch1-log Debian etch distribution
> django: 0.97
> os: linux
>
>
> DEV DB
> ----------------------------------------------------------------------
> character_set_client      utf8
> character_set_connection  utf8
> character_set_database    latin1
> character_set_filesystem  binary
> character_set_results     utf8
> character_set_server      latin1
> character_set_system      utf8
> character_sets_dir        C:\\Program Files\\MySQL\\MySQL Server 6.0\\share\\charsets\\
>
> collation_connection      utf8_general_ci
> collation_database        latin1_swedish_ci
> collation_server          latin1_swedish_ci
>
> On every table I did a
> alter table tablename CONVERT TO CHARACTER SET utf8 collate utf8_general_ci
>
>
> PROD DB
> ----------------------------------------------------------------------
> character_set_client      utf8
> character_set_connection  utf8
> character_set_database    utf8
> character_set_filesystem  binary
> character_set_results     utf8
> character_set_server      utf8
> character_set_system      utf8
> character_sets_dir        /usr/share/mysql/charsets/
>
> collation_connection      utf8_general_ci
> collation_database        utf8_general_ci
> collation_server          utf8_unicode_ci
>
> On every table I did a
> alter table tablename CONVERT TO CHARACTER SET utf8 collate utf8_general_ci
>
>
> DEV SERVER: django dev server
> ----------------------------------------------------------------------
> no special setting
>
>
> PROD SERVER: apache2
> ----------------------------------------------------------------------
> AddDefaultCharset utf8
>
>
> DJANGO SETTINGS (prod and serv)
> ----------------------------------------------------------------------
> DEFAULT_CHARSET = 'utf-8'
> TIME_ZONE = 'America/New York'
> LANGUAGE_CODE = 'en-us'
> USE_I18N = True
>