Thanks. It was actually a combination of issues. The database was
UTF8; I should have added to my original post that I could manually
insert and retrieve UTF8 data.

The data we are pulling (we are migrating from one system to a new
one built on Django) is a bit of a nest of encoding issues, so things
that look like UTF8 may not be. I think my attempts to encode this
data as UTF8 started the problem.
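
For anyone cleaning up similar data, here is a minimal sketch of one way to
spot and repair this kind of character. It assumes (and this is only an
assumption about data like ours) that the stray U+0092 came from a
Windows-1252 byte 0x92 that was decoded as Latin-1 somewhere upstream. The
helper names find_c1_controls and repair_cp1252 are made up for
illustration, and the code is written for Python 3 rather than the
Python 2.5 shown in the quoted session.

```python
import unicodedata


def find_c1_controls(text):
    """Return (index, char) pairs for C1 control characters (U+0080..U+009F)."""
    return [(i, ch) for i, ch in enumerate(text) if 0x80 <= ord(ch) <= 0x9F]


def repair_cp1252(text):
    """Reinterpret C1 control characters as Windows-1252 bytes.

    Every other character is passed through unchanged.
    """
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                # A few byte values are undefined in cp1252; keep them as-is.
                pass
        out.append(ch)
    return "".join(out)


title = "Save up to 25% on your online order of select HP LaserJet\x92s"
fixed = repair_cp1252(title)

print(find_c1_controls(title))       # where the suspicious character sits
print(unicodedata.name(fixed[-2]))   # RIGHT SINGLE QUOTATION MARK
```

If the assumption does not hold for a given source, the function should not
be applied blindly; listing the hits from find_c1_controls first and
checking a few by eye is the safer order.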

Thanks for the help and the general heads up on encoding and unicode
with django. I have read about it, but I understand it better each
time I encounter a problem with it.

--Jim

On May 24, 8:30 am, Karen Tracey <kmtra...@gmail.com> wrote:
> On Sun, May 23, 2010 at 10:10 PM, vjimw <im.a.machobea...@gmail.com> wrote:
> > I have been reading up on Unicode with Python and Django and I think I
> > have my code set to use UTF8 data when saving or updating an object
> > but I get an error on model.save()
>
> > My database and all of its tables are UTF8 encoded with UTF8 collation
> > (DEFAULT CHARSET=utf8;)
> > The data I am inputting is unicode
> > (u'Save up to 25% on your online order of select HP LaserJet\x92s')
> > <type 'unicode'>
>
> > But when I try to save this data I get an error
> > Incorrect string value: '\\xC2\\x92s' for column 'title' at row 1
>
> This error implies that your MySQL table is not set up the way you think it
> is, with a charset of utf8. Given a table that actually has a utf8 charset:
>
> k...@lbox:~/software/web/playground$ mysql -p Play2
> Enter password:
> Reading table information for completion of table and column names
> You can turn off this feature to get a quicker startup with -A
>
> Welcome to the MySQL monitor.  Commands end with ; or \g.
> Your MySQL connection id is 5852
> Server version: 5.0.67-0ubuntu6.1 (Ubuntu)
>
> Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
>
> mysql> show create table ttt_tag;
> +---------+--------------------------------------------------------------+
> | Table   | Create Table                                                 |
> +---------+--------------------------------------------------------------+
> | ttt_tag | CREATE TABLE `ttt_tag` (
>   `id` int(11) NOT NULL auto_increment,
>   `name` varchar(88) NOT NULL,
>   PRIMARY KEY  (`id`)
> ) ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 |
> +---------+--------------------------------------------------------------+
> 1 row in set (0.00 sec)
>
> I can create an object in Django using the odd unicode character your
> string  includes (though I'm not sure what it is supposed to be -- based on
> its placement I'd guess it is supposed to be a registered trademark symbol
> but that's not what you actually have):
>
> k...@lbox:~/software/web/playground$ python manage.py shell
> Python 2.5.2 (r252:60911, Jan 20 2010, 23:16:55)
> [GCC 4.3.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> (InteractiveConsole)
>
> >>> from ttt.models import Tag
> >>> t = Tag.objects.create(name=u'HP LaserJet\x92s')
> >>> print t
> HP LaserJet s
> >>> quit()
>
> So that works, though the character does not print as anything useful.
>
> If I change the table to have a charset of latin1 (MySQL's default):
>
> mysql> drop table ttt_tag;
> Query OK, 0 rows affected (0.00 sec)
> mysql> create table ttt_tag (id int(11) not null auto_increment, name
> varchar(88) not null, primary key (id)) engine=myisam default charset
> latin1;
> Query OK, 0 rows affected (0.01 sec)
>
> I can then recreate the error you report:
>
> >>> t = Tag.objects.create(name=u'HP LaserJet\x92s')
>
> Traceback (most recent call last):
>   File "<console>", line 1, in <module>
> [snipped]
>   File "/usr/lib/python2.5/warnings.py", line 102, in warn_explicit
>     raise message
> Warning: Incorrect string value: '\xC2\x92s' for column 'name' at row 1
>
> So I think one problem is that your table is not actually set up the way you
> think it is.
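
The bytes in that error message can be reproduced without a database. The
sketch below (Python 3, not the Python 2.5 of the session above) shows that
'\xC2\x92' is simply the UTF-8 encoding of U+0092, and that a cp1252-style
single-byte charset has no slot for that character. Using Python's cp1252
codec as a stand-in for MySQL's latin1 is an assumption made for
illustration (MySQL documents latin1 as "cp1252 West European", but the two
are not guaranteed to match byte for byte), and the helper name encodable
is hypothetical.

```python
import unicodedata


def encodable(text, codec):
    """Return True if every character in text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


name = "HP LaserJet\x92s"
encoded = name.encode("utf-8")

print(encoded[-3:])                   # b'\xc2\x92s', as in the error message
print(unicodedata.category("\x92"))   # Cc, i.e. a control character
print(encodable(name, "utf-8"))       # True
print(encodable(name, "cp1252"))      # False
```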
>
> Another may be that your data is not really correct either. What you are
> showing that you have in your data is this character:
>
> http://www.fileformat.info/info/unicode/char/0092/index.htm
>
> and I suspect what you really want is either of these:
>
> http://www.fileformat.info/info/unicode/char/2122/index.htm
> http://www.fileformat.info/info/unicode/char/00ae/index.htm
>
> Either of these would display better than what you have:
>
> >>> u1 = u'LaserJet\u2122'
> >>> print u1
> LaserJet(tm)
> >>> u2 = u'LaserJet\xae'
> >>> print u2
>
> LaserJet(R)
>
> Karen
> -- http://tracey.org/kmt/
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Django users" group.
> To post to this group, send email to django-us...@googlegroups.com.
> To unsubscribe from this group, send email to 
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-users?hl=en.
