Diez B. Roggisch wrote:
> Stefan Behnel wrote:
>
>> [EMAIL PROTECTED] wrote:
>>> how can I get the number of byte of the string in python?
>>> with "len(string)", it doesn't work to get the size of the string in
>>> bytes if I have the unicode string but just the length. (it only works
>>> fine for ascii/latin1) In data structure, I have to store unicode
>>> string for many languages and must know exactly how big of my string
>>> which is stored so I can read back later.
>> I do not quite know what you could possibly need that for, but AFAICT
>> Python only uses two different unicode encodings depending on the
>> platform.
>
> It is very important for relational databases, as these usually constrain
> the amount of bytes per column - so you need the size of bytes, not the
> number of unicode characters.

So then the easiest thing to do is: take the maximum length of a unicode
string you could possibly want to store, multiply it by 4 (no code point
needs more than four bytes in UTF-8 or UCS-4) and make that the length of
the DB field.

However, I'm pretty convinced it is a bad idea to store Python unicode
strings directly in a DB, especially as their internal representation is
not portable across platforms. I assume that some DB connectors already
honour the local platform encoding, but I'd still say that UTF-8 is your
best friend here.

Stefan
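
For reference, a minimal sketch of the UTF-8 approach suggested above:
encode the string explicitly and take the length of the resulting bytes.
It assumes Python 3, where str is already a unicode string (in the Python 2
of this thread you would start from a u"..." literal instead). The helper
names utf8_size and fits_column are made up for illustration and are not
part of any library or DB connector.

```python
def utf8_size(text):
    """Return the number of bytes `text` occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_column(text, max_bytes):
    """Hypothetical check: does `text` fit a column limited to `max_bytes` bytes?"""
    return utf8_size(text) <= max_bytes


# ASCII, Latin-1 range, CJK and a code point outside the BMP.
samples = ["abc", "gr\u00fc\u00df", "\u65e5\u672c\u8a9e", "\U0001F600"]

for s in samples:
    # len(s) counts code points; utf8_size(s) counts encoded bytes.
    print(repr(s), len(s), utf8_size(s))
```

Sizing the field at four times the maximum character count is the worst
case for this encoding; for mostly ASCII or Latin-1 text the encoded size
will be much smaller than that bound.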