MonkeeSage wrote:
> John Machin wrote:
>> The answer is, "You can't", and the rationale would have to be that
>> nobody thought of a use case for counting the length of the UTF-8 form
>> but not creating the UTF-8 form. What is your use case?
>
> Playing DA here, what if you need to send the byte-count on a server
> via a header, but need the utf8 representation for the actual data?

So what? You need it in the end, don't you? The runtime complexity of the
calculation will be the same: you have to consider each character, so it's
O(n). Of course you will roughly double the memory consumption, since the
original unicode is represented as UCS2 or UCS4 and the UTF-8 form exists
alongside it. But then, if that really is a problem, how would you work
with that string anyway? You would have to resort to slicing and computing
the size of the parts, which remedies the memory problem easily.

Diez
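
For illustration, here is a minimal sketch of both options discussed above: counting the UTF-8 length per character without building the encoded form, and summing the encoded sizes of slices. It assumes Python 3, where iterating a str yields whole code points and slicing cannot split a surrogate pair. The names utf8_len and utf8_len_chunked and the chunk_size default are made up for this example, not part of any library.

```python
def utf8_len(text):
    """Count the bytes UTF-8 needs for text without encoding it. O(n)."""
    total = 0
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:          # ASCII range: 1 byte
            total += 1
        elif cp < 0x800:       # up to U+07FF: 2 bytes
            total += 2
        elif cp < 0x10000:     # rest of the BMP: 3 bytes
            total += 3
        else:                  # supplementary planes: 4 bytes
            total += 4
    return total


def utf8_len_chunked(text, chunk_size=4096):
    """Encode one slice at a time, so only a small byte string is alive at once.

    chunk_size is an arbitrary illustrative default.
    """
    return sum(
        len(text[i:i + chunk_size].encode("utf-8"))
        for i in range(0, len(text), chunk_size)
    )


if __name__ == "__main__":
    sample = "h\u00e9llo \u20ac"
    print(utf8_len(sample), utf8_len_chunked(sample, chunk_size=2))
```

Both functions walk the whole string once. The chunked variant still allocates byte strings, just small ones, which is the trade-off the slicing suggestion implies.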