Thanks for the informative reply. I searched the tracker and didn't find anything about it, but I think I was not using the right terms.
I had a feeling it had to do with charset and MySQLdb, and I definitely agree it is not a Django thing. I'll keep my eye on Ubuntu's changelog for MySQLdb and post here and in the ticket if I can figure anything out.

On Oct 9, 2:18 pm, Karen Tracey <kmtra...@gmail.com> wrote:
> On Fri, Oct 9, 2009 at 12:40 PM, Brian Morton <rokclim...@gmail.com> wrote:
>
> > This is a very strange problem, so I thought I would post here and see
> > if anyone else had seen this problem.
>
> > I introspected a MySQL database with Python2.6 using Django SVN HEAD
> > and it produced my models as expected. However, all CharFields have
> > the max_length set to 3x the actual varchar field length in the db.
> > For example, all char(1) or varchar(1) fields were represented with a
> > max_length of 3. Has anyone ever seen this issue before?
>
> Searching the tracker reveals:
>
> http://code.djangoproject.com/ticket/5725
>
> I am not sure the situation is quite as complicated as it is thought to be
> in that ticket. Discussion in the ticket seems to think that the "right"
> answer to return is going to be dependent on the table charset (actually, it
> would need to be column, since the charset can be set per-column).
>
> However, near as I can tell from a couple of brief experiments, that's not
> the case. For a latin1 encoded table with varchar(50) column the value
> determined by inspectdb is 150. Similarly, for a utf-8 encoded table with a
> varchar(32) column, the value determined by inspectdb is 96. In both cases
> the value determined by inspectdb is 3x higher than the actual number of
> characters that can be stored in the column, no matter the column's
> charset.
>
> Where is the 3x factor coming from? In the ticket it is mentioned that it's
> related to the connection charset being utf-8. Switch the connection
> charset to latin1, and the numbers get reported properly (at least for
> latin1-encoded tables).
>
> The number in question here is the internal_size element of the description
> returned by the connection cursor. The value is defined by the Python DB
> API (http://www.python.org/dev/peps/pep-0249/) but I can find no good
> description of what it is supposed to be, exactly.
>
> MySQLdb (or underlying code it is using) appears to be implementing this as
> the maximum number of bytes that may be needed to hold a value returned from
> this column on this connection. That is, since on the DB side the length
> specification (since MySQL 4.1) describes the number of characters that may
> be stored in the column, and since a character may require as many as 3
> bytes in utf-8 encoding (MySQL does not implement 4-byte utf-8 support),
> some code somewhere is taking the max-length-in-characters value and
> multiplying it by 3 to come up with a maximum number of bytes that may be
> required to store a value from this column in the connection's charset.
>
> Since Django is always going to set the connection charset to utf-8, and
> since inspectdb should be reporting character lengths, not byte lengths, it
> might be sufficient to take the internal_size value and divide by 3 to get
> character length values. That might work so long as the underlying value
> returned by MySQLdb doesn't change, yet this page:
>
> http://benanne.net/code/?p=352
>
> states the value returned is wrong and there's a fix (without giving any
> details on how it is wrong, what the fix is, nor when it might appear in a
> release of MySQLdb). And I don't have any more time to play with
> investigation on this...but if different versions of MySQLdb are going to be
> reporting different values here then fixing this in Django will be a bit more
> complicated than unconditionally dividing by 3....though still not quite as
> bad as thought to be in the ticket, I don't think.
>
> Karen
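
For anyone following along, the divide-by-3 idea from the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not Django's or MySQLdb's actual code: `MAX_BYTES_PER_CHAR`, `char_length` and `max_length_from_description` are hypothetical names, and the sample description rows are made up to match the numbers quoted above. The only thing taken from PEP 249 is the layout of a description row, where internal_size is the fourth element.

```python
# Assumption: the driver reports internal_size as
#   (column length in characters) * (max bytes per character on the connection).
# The factors below follow the observations in this thread (3 for utf8, 1 for latin1).
MAX_BYTES_PER_CHAR = {"utf8": 3, "latin1": 1}


def char_length(internal_size, connection_charset="utf8"):
    """Convert a byte-based internal_size back to a length in characters."""
    factor = MAX_BYTES_PER_CHAR[connection_charset]
    if internal_size % factor != 0:
        # Not a clean multiple, so the assumption above does not hold for this
        # driver version; return the raw value rather than guess.
        return internal_size
    return internal_size // factor


def max_length_from_description(row, connection_charset="utf8"):
    """Return (column name, character length) from a PEP 249 description row.

    A description row is a 7-item sequence:
    (name, type_code, display_size, internal_size, precision, scale, null_ok)
    """
    name, internal_size = row[0], row[3]
    return name, char_length(internal_size, connection_charset)


# Made-up rows mimicking what cursor.description might contain on a utf8 connection.
sample_rows = [
    ("title", None, None, 150, None, None, True),  # varchar(50)
    ("slug", None, None, 96, None, None, True),    # varchar(32)
    ("flag", None, None, 3, None, None, True),     # char(1)
]
results = [max_length_from_description(row) for row in sample_rows]
```

As Karen points out, this only holds while MySQLdb keeps reporting the value this way; if a later release changes what internal_size means, an unconditional division would give wrong lengths, which is why the sketch falls back to the raw value when it is not a clean multiple.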