> So then the easiest thing to do is: take the maximum length of a unicode
> string you could possibly want to store, multiply it by 4 and make that
> the length of the DB field.
>
> However, I'm pretty convinced it is a bad idea to store Python unicode
> strings directly in a DB, especially as they are not portable. I assume
> that some DB connectors honour the local platform encoding already, but
> I'd still say that UTF-8 is your best friend here.

It was your assumption that the OP wanted to store the "real" unicode strings. That is a moot point anyway, as it is, AFAIK, not possible to get their contents in byte form (except from a C extension).

And assuming 4 bytes per character is a bit wasteful, I'd say, especially when more than 80% of your text falls in the ASCII subset, as is the case for European and American languages.

The solution was given before: choose an encoding (UTF-8 is certainly the most favorable one), and compute the length of the resulting byte string.

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list
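
A minimal sketch of the "encode, then measure" approach described above. It assumes Python 3, where `str` is the unicode type (on Python 2 the same `.encode("utf-8")` call applies to `unicode` objects). The helper names `utf8_byte_length` and `fits_in_field` are made up for illustration and are not part of any DB connector.

```python
def utf8_byte_length(text):
    """Return the number of bytes `text` occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_in_field(text, field_bytes):
    """Check whether the UTF-8 form of `text` fits a field of `field_bytes` bytes.

    `field_bytes` is a hypothetical column size measured in bytes, not characters.
    """
    return utf8_byte_length(text) <= field_bytes


if __name__ == "__main__":
    # ASCII, a Latin-1 letter, the euro sign, and a character outside the BMP.
    samples = ["hello", "na\u00efve", "\u20ac", "\U0001d11e"]
    for s in samples:
        # Character count vs. encoded byte count.
        print(ascii(s), len(s), utf8_byte_length(s))
```

For mostly-ASCII text the encoded length stays close to the character count, so sizing the field from measured data is usually far less wasteful than multiplying by 4.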