Thanks for the informative reply.  I searched the tracker and didn't
find anything about it, but I think I was not using the right terms.

I had a feeling it had to do with charset and MySQLdb, and I
definitely agree it is not a Django thing.  I'll keep my eye on
Ubuntu's changelog for MySQLdb and post here and in the ticket if I
can figure anything out.

On Oct 9, 2:18 pm, Karen Tracey <kmtra...@gmail.com> wrote:
> On Fri, Oct 9, 2009 at 12:40 PM, Brian Morton <rokclim...@gmail.com> wrote:
>
> > This is a very strange problem, so I thought I would post here and see
> > if anyone else had seen this problem.
>
> > I introspected a MySQL database with Python2.6 using Django SVN HEAD
> > and it produced my models as expected.  However, all CharFields have
> > the max_length set to 3x the actual varchar field length in the db.
> > For example, all char(1) or varchar(1) fields were represented with a
> > max_length of 3.  Has anyone ever seen this issue before?
>
> Searching the tracker reveals:
>
> http://code.djangoproject.com/ticket/5725
>
> I am not sure the situation is quite as complicated as it is thought to be
> in that ticket.  Discussion in the ticket seems to think that the "right"
> answer to return is going to be dependent on the table charset (actually, it
> would need to be column, since the charset can be set per-column).
>
> However, near as I can tell from a couple of brief experiments, that's not
> the case.  For a latin1 encoded table with varchar(50) column the value
> determined by inspectdb is 150.  Similarly, for a utf-8 encoded table with a
> varchar(32) column, the value determined by inspectdb is 96.  In both cases
> the value determined by inspectdb is 3x higher than the actual number of
> characters that can be stored in the column, no matter the cholumn's
> charset.
>
> Where is the 3x factor coming from?  In the ticket it is mentioned that it's
> related to the connection charset being utf-8.  Switch the connection
> charset to latin1, and the numbers get reported properly (at least for
> latin1-encoded tables).
>
> The number in question here is the internal_size element of the description
> returned by the connection cursor.  The value is defined by the Python DB
> API (http://www.python.org/dev/peps/pep-0249/) but I can find no good
> description of what it is supposed to be, exactly.
>
> MySQLdb (or underlying code it is using) appears to be implementing this as
> the maximum number of bytes that may be needed to hold a value returned from
> this column on this connection.  That is, since on the DB side the length
> specification (since MySQL 4.1) describes the number of characters that may
> be stored in the column, and since a character may require as many as 3
> bytes in utf-8 encoding (MySQL does not implement 4-byte utf-8 support),
> some code somewhere is taking the max-length-in-characters value and
> multiplying it by 3 to come up with a maximum number of bytes that may be
> required to store a value from this column in the connection's charset.
>
> Since Django is always going to set the connection charset to utf-8, and
> since inspectdb should be reporting character lengths, not byte lengths, it
> might be sufficient to take the internal_size value and divide by 3 to get
> character length values.  That might work so long as the underlying value
> returned by MySQLdb doesn't change, yet this page:
>
> http://benanne.net/code/?p=352
>
> states the value returned is wrong and there's a fix (without giving any
> details on how it is wrong, what the fix is, nor when it might appear in a
> release of MySQLdb).  And I don't have any more time to play with
> investigation on this...but if different versions of MySQLdb are going to be
> reporting different values here then fixing this in Django will be a it more
> complicated than unconditionally dividing by 3....though still not quite as
> bad as thought to be in the ticket, I don't think.
>
> Karen
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to