Re: how to get size of unicode string/string in bytes ?

Stefan Behnel Tue, 01 Aug 2006 03:15:45 -0700

Diez B. Roggisch wrote:
> Stefan Behnel wrote:
> 
>> [EMAIL PROTECTED] wrote:
>>>   How can I get the number of bytes in a string in Python?
>>> "len(string)" gives only the length in characters, not the size in
>>> bytes, when the string is unicode (the two agree only for
>>> ascii/latin1). In my data structure, I have to store unicode strings
>>> for many languages and must know exactly how many bytes each stored
>>> string occupies so I can read it back later.
>> I do not quite know what you could possibly need that for, but AFAICT
>> Python only uses two different unicode encodings depending on the
>> platform.
> 
> It is very important for relational databases, as these usually constrain
> the number of bytes per column - so you need the size in bytes, not the
> number of unicode characters.


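So then the easiest thing to do is to take the maximum length (in characters)
of any unicode string you could possibly want to store, multiply it by 4 (the
most bytes one character needs in UTF-8 or in Python's internal
representation) and make that the length of the DB field.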
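
To make the "multiply by 4" rule concrete, here is a rough sketch. It assumes
Python 3, where str is the unicode type, and the helper name max_field_bytes
is made up for illustration. It only shows that no single character needs more
than 4 bytes in UTF-8, so the budget is a safe upper bound, not a tight one.

```python
# Worst-case sizing sketch (assumes Python 3; names are hypothetical).
MAX_BYTES_PER_CODE_POINT = 4  # upper bound for UTF-8 and UTF-32


def max_field_bytes(max_chars: int) -> int:
    """Return a byte budget that always fits max_chars code points."""
    return max_chars * MAX_BYTES_PER_CODE_POINT


# One sample character for each UTF-8 width: 1, 2, 3 and 4 bytes.
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]
utf8_sizes = [len(s.encode("utf-8")) for s in samples]

print(max_field_bytes(50))  # byte budget for a 50-character field
print(utf8_sizes)           # bytes used by each sample in UTF-8
```

For mostly-ASCII data this over-allocates a lot, which is the price of not
having to measure each string.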

However, I'm pretty convinced it is a bad idea to store Python unicode strings
directly in a DB, especially as their internal representation is not portable
across platforms. I assume that some DB connectors already honour the local
platform encoding, but I'd still say that UTF-8 is your best friend here.
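
If you go the UTF-8 route, the size in bytes is simply the length of the
encoded string. A minimal sketch, again assuming Python 3 (in Python 2 you
would call .encode on a unicode object the same way); utf8_size and
fits_column are hypothetical helper names, not part of any DB connector:

```python
# Measure the encoded size of a string (assumes Python 3; names are made up).
def utf8_size(text: str) -> int:
    """Number of bytes the string occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_column(text: str, max_bytes: int) -> bool:
    """True if the UTF-8 form of text fits a column limited to max_bytes."""
    return utf8_size(text) <= max_bytes


word = "na\u00efve"              # 5 characters, one of them non-ASCII
stored = word.encode("utf-8")    # the bytes you would write to the DB
restored = stored.decode("utf-8")  # what you get when reading back

print(len(word), utf8_size(word))  # character count vs. byte count
print(fits_column(word, 5))        # does it fit a 5-byte column?
```

The point is that len() on the string counts characters, while len() on the
encoded bytes gives the number the database cares about.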

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list
