Re: byte count unicode string

2006-09-22 Thread Paul Rubin
willie <[EMAIL PROTECTED]> writes: > >>> ustr = buf.decode('UTF-8') > >>> type(ustr) > > Is it a "unicode object that contains a UTF-8 encoded > string object?" No, it's just unicode, which is a string over a certain character set. UTF-8 is a way to encode unicode strings as byte strings. You
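The distinction drawn in this reply can be illustrated with a small sketch. It assumes Python 3, where the thread's `unicode` type is called `str` and byte strings are `bytes`; the variable names other than `buf` and `ustr` are made up for the illustration.

```python
# Assumption: Python 3 (str is the thread's "unicode", bytes is a byte string).
buf = b"example\xc2\x9d"          # nine bytes of UTF-8 encoded data

ustr = buf.decode("utf-8")        # decoding yields plain text, not "UTF-8 text"
assert type(ustr) is str

# The same text can be encoded in several ways; the encoding belongs to the
# bytes produced, not to the text object itself.
as_utf8 = ustr.encode("utf-8")
as_utf16 = ustr.encode("utf-16-le")
as_latin1 = ustr.encode("latin-1")

print(len(ustr), len(as_utf8), len(as_utf16), len(as_latin1))
```

Decoding each byte string with its own codec gives back an equal `str`, which is one way to see that the text object carries no memory of UTF-8.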

byte count unicode string

2006-09-20 Thread willie
>willie wrote: >> >> Thanks for the thorough explanation. One last question >> about terminology then I'll go away :) >> What is the proper way to describe "ustr" below? >> >>> ustr = buf.decode('UTF-8') >> >>> type(ustr) >> >> Is it a "unicode object that contains a UTF-8 encoded >>

Re: byte count unicode string

2006-09-20 Thread John Machin
willie wrote: > > Thanks for the thorough explanation. One last question > about terminology then I'll go away :) > What is the proper way to describe "ustr" below? > > >>> ustr = buf.decode('UTF-8') > >>> type(ustr) > > > > Is it a "unicode object that contains a UTF-8 encoded > string object?

Re: byte count unicode string

2006-09-20 Thread Gabriel Genellina
At Wednesday 20/9/2006 19:53, willie wrote: What is the proper way to describe "ustr" below? >>> ustr = buf.decode('UTF-8') >>> type(ustr) Is it a "unicode object that contains a UTF-8 encoded string object?" ustr is a unicode object. Period. A unicode object contains characters (not

Re: byte count unicode string

2006-09-20 Thread Virgil Dupras
MonkeeSage wrote: > OK, so the devil always loses. ;P > > Regards, > Jordan Huh? The devil always loses? *turns TV on, watches the news, turns TV off* Nope, buddy. Quite the contrary. -- http://mail.python.org/mailman/listinfo/python-list

byte count unicode string

2006-09-20 Thread willie
Martin v. Löwis: >willie schrieb: > >> Thank you for your patience and for educating me. >> (Though I still have a long way to go before enlightenment) >> I thought Python might have a small weakness in >> lacking an efficient way to get the number of bytes >> in a "UTF-8 encoded Python str

Re: byte count unicode string

2006-09-20 Thread Martin v. Löwis
willie schrieb: > Thank you for your patience and for educating me. > (Though I still have a long way to go before enlightenment) > I thought Python might have a small weakness in > lacking an efficient way to get the number of bytes > in a "UTF-8 encoded Python string object" (proper?), > but I've

byte count unicode string

2006-09-20 Thread willie
John Machin: >Good luck! Thank you for your patience and for educating me. (Though I still have a long way to go before enlightenment) I thought Python might have a small weakness in lacking an efficient way to get the number of bytes in a "UTF-8 encoded Python string object" (proper?), but I'v

Re: byte count unicode string

2006-09-20 Thread Diez B. Roggisch
willie wrote: > John Machin: > > >You are confusing the hell out of yourself. You say that your web app > >deals only with UTF-8 strings. Where do you get "the unicode string" > >from??? If name is a utf-8 string, as your comment says, then len(name) > >is all you need!!! > > > # I'll go ah

Re: byte count unicode string

2006-09-20 Thread John Machin
willie wrote: > John Machin: > > >You are confusing the hell out of yourself. You say that your web app > >deals only with UTF-8 strings. Where do you get "the unicode string" > >from??? If name is a utf-8 string, as your comment says, then len(name) > >is all you need!!! > > > # I'll go ahead

byte count unicode string

2006-09-20 Thread willie
John Machin: >You are confusing the hell out of yourself. You say that your web app >deals only with UTF-8 strings. Where do you get "the unicode string" >from??? If name is a utf-8 string, as your comment says, then len(name) >is all you need!!! # I'll go ahead and concede defeat since you

Re: byte count unicode string

2006-09-20 Thread John Machin
willie wrote: > >willie wrote: > >> Marc 'BlackJack' Rintsch: > >> > >> >In <[EMAIL PROTECTED]>, willie > wrote: > >> >> # What's the correct way to get the > >> >> # byte count of a unicode (UTF-8) string? > >> >> # I couldn't find a builtin method > >> >> # and the following is memory

Re: byte count unicode string

2006-09-20 Thread MonkeeSage
OK, so the devil always loses. ;P Regards, Jordan

Re: byte count unicode string

2006-09-20 Thread Paul Rubin
Duncan Booth <[EMAIL PROTECTED]> writes: > I guess you could invent something like inserting a string into a database > which has fixed size fields, silently truncates fields which are too long > and stores the strings internally in utf-8 but only accepts ucs-2 in its > interface. Pretty far fet

Re: byte count unicode string

2006-09-20 Thread Duncan Booth
"MonkeeSage" <[EMAIL PROTECTED]> wrote: > John Machin wrote: >> The answer is, "You can't", and the rationale would have to be that >> nobody thought of a use case for counting the length of the UTF-8 form >> but not creating the UTF-8 form. What is your use case? > > Playing DA here, what if yo

Re: byte count unicode string

2006-09-20 Thread Diez B. Roggisch
MonkeeSage schrieb: > John Machin wrote: >> The answer is, "You can't", and the rationale would have to be that >> nobody thought of a use case for counting the length of the UTF-8 form >> but not creating the UTF-8 form. What is your use case? > > Playing DA here, what if you need to send the by

Re: byte count unicode string

2006-09-20 Thread MonkeeSage
John Machin wrote: > The answer is, "You can't", and the rationale would have to be that > nobody thought of a use case for counting the length of the UTF-8 form > but not creating the UTF-8 form. What is your use case? Playing DA here, what if you need to send the byte-count on a server via a he

byte count unicode string

2006-09-20 Thread willie
>willie wrote: >> Marc 'BlackJack' Rintsch: >> >> >In <[EMAIL PROTECTED]>, willie wrote: >> >> # What's the correct way to get the >> >> # byte count of a unicode (UTF-8) string? >> >> # I couldn't find a builtin method >> >> # and the following is memory inefficient. >> >> ustr =

Re: byte count unicode string

2006-09-20 Thread John Machin
willie wrote: > Marc 'BlackJack' Rintsch: > > >In <[EMAIL PROTECTED]>, willie wrote: > >> # What's the correct way to get the > >> # byte count of a unicode (UTF-8) string? > >> # I couldn't find a builtin method > >> # and the following is memory inefficient. > > >> ustr = "example\xC2\x9D"

byte count unicode string

2006-09-20 Thread willie
Marc 'BlackJack' Rintsch: >In <[EMAIL PROTECTED]>, willie wrote: >> # What's the correct way to get the >> # byte count of a unicode (UTF-8) string? >> # I couldn't find a builtin method >> # and the following is memory inefficient. >> ustr = "example\xC2\x9D".decode('UTF-8') >> num_chars

Re: byte count unicode string

2006-09-19 Thread Marc 'BlackJack' Rintsch
In <[EMAIL PROTECTED]>, willie wrote: > # What's the correct way to get the > # byte count of a unicode (UTF-8) string? > # I couldn't find a builtin method > # and the following is memory inefficient. > > ustr = "example\xC2\x9D".decode('UTF-8') > > num_chars = len(ustr)# 8 > > buf = ustr.

Re: byte count unicode string

2006-09-19 Thread John Machin
willie wrote: > # What's the correct way to get the > # byte count of a unicode (UTF-8) string? > # I couldn't find a builtin method > # and the following is memory inefficient. > > ustr = "example\xC2\x9D".decode('UTF-8') > > num_chars = len(ustr)# 8 > > buf = ustr.encode('UTF-8') > > num_byte

byte count unicode string

2006-09-19 Thread willie
# What's the correct way to get the
# byte count of a unicode (UTF-8) string?
# I couldn't find a builtin method
# and the following is memory inefficient.

ustr = "example\xC2\x9D".decode('UTF-8')

num_chars = len(ustr)    # 8

buf = ustr.encode('UTF-8')

num_bytes = len(buf)     # 9

# Thanks.
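For readers on Python 3, the question could be sketched roughly as follows. The first part repeats the encode-and-measure approach from the post. The second part, `utf8_len`, is a hypothetical helper (not a builtin and not something proposed in the thread) that adds up the UTF-8 width of each code point without building the encoded byte string. It is a pure-Python loop, so it may well be slower than simply calling `encode`; it only illustrates the arithmetic.

```python
# Assumption: Python 3, so the byte literal needs a b prefix.
ustr = b"example\xC2\x9D".decode("utf-8")

num_chars = len(ustr)                   # number of code points
num_bytes = len(ustr.encode("utf-8"))   # builds a temporary bytes object


def utf8_len(text: str) -> int:
    """Hypothetical helper: count UTF-8 bytes without encoding.

    Note: lone surrogates (U+D800..U+DFFF) are counted as 3 bytes here,
    whereas text.encode("utf-8") would raise UnicodeEncodeError for them.
    """
    total = 0
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:         # ASCII range: 1 byte
            total += 1
        elif cp < 0x800:      # up to U+07FF: 2 bytes
            total += 2
        elif cp < 0x10000:    # rest of the Basic Multilingual Plane: 3 bytes
            total += 3
        else:                 # supplementary planes: 4 bytes
            total += 4
    return total


print(num_chars, num_bytes, utf8_len(ustr))
```

If the data is already a UTF-8 byte string, as later replies in the thread point out, `len()` on that byte string is all that is needed.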