willie wrote:
> John Machin:
>
> >You are confusing the hell out of yourself. You say that your web app
> >deals only with UTF-8 strings. Where do you get "the unicode string"
> >from??? If name is a utf-8 string, as your comment says, then len(name)
> >is all you need!!!
>
>
> # I'll go ahead and concede defeat since you appear to be on the
> # verge of a heart attack :)
> # I can see that I lack clarity so I don't blame you.

Could you please change your style of quoting/posting? It is extremely
confusing - not only using a different character than > for citations,
but also appearing to cite yourself while in fact it is your answer one
reads.

I'm all for expressing oneself and proving to be an individual - but
communication can get tricky even with standardized manners of doing so,
and there is no need to add more confusion.

> # By UTF-8 string, I mean a unicode object with UTF-8 encoding:
>
> >>> type(ustr)
> <type 'unicode'>
> >>> repr(ustr)
> "u'\\u2708'"

You ARE confusing the hell out of yourself. There is no such thing as a
unicode object with UTF-8 encoding. There are unicode objects. And there
are byte-strings, which may happen to represent text encoded in utf-8.

What you see above is a unicode code point literal - which is translated
to a certain utf-8 string, that looks suspiciously alike because of the
way utf-8 defines the mapping between the code-points of unicode to
utf-8.

But it still remains true: a unicode object is a unicode object. And has
no encoding whatsoever!

> # The database API expects unicode objects:
> # A template query, then a variable number of values.
> # Perhaps I'm a victim of arbitrary design decisions :)

The same happens in Java all the time, as Java only deals with unicode
strings. And for dealing with it, you also need to explicitly convert
them to a properly encoded byte array. Unfortunate, but true.

Diez
--
http://mail.python.org/mailman/listinfo/python-list
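
P.S. A minimal sketch of the unicode object vs. byte-string distinction
made above. It assumes Python 3, where `str` plays the role of the
`unicode` type shown in the session above and `bytes` plays the role of
the byte-string; the variable name `encoded` is only illustrative.

```python
# Assumption: Python 3. `str` is the text type (called `unicode` in the
# Python 2 session quoted above); `bytes` is the byte-string type.
ustr = "\u2708"                 # one code point, U+2708; no encoding attached

# Encoding is an explicit step that produces a separate byte-string object.
encoded = ustr.encode("utf-8")

print(type(ustr).__name__, len(ustr))        # text object, length in code points
print(type(encoded).__name__, len(encoded))  # byte-string, length in bytes
print(encoded)                               # the raw UTF-8 bytes

# Decoding with the same codec gives back an equal text object.
assert encoded.decode("utf-8") == ustr
```

The text object has length 1, while its UTF-8 byte-string has length 3,
so len() answers a different question depending on which of the two
objects you hand it.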