willie wrote:
> # What's the correct way to get the
> # byte count of a unicode (UTF-8) string?
> # I couldn't find a builtin method
> # and the following is memory inefficient.
>
> ustr = "example\xC2\x9D".decode('UTF-8')
>
> num_chars = len(ustr) # 8
>
> buf = ustr.encode('UTF-8')
>
> num_bytes = len(buf) # 9

    num_bytes = len("example\xC2\x9D")

This produces 9; isn't that what you want? If not, please explain, with examples, what you mean by "the byte count of a unicode (UTF-8) string".

HTH,
John
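
If the worry is the temporary buffer that encode() creates, the count can also be summed one code point at a time. This is only a rough sketch, and it assumes Python 3 str objects rather than the Python 2 strings quoted above. The name utf8_byte_count is made up for illustration; it is not a builtin.

```python
def utf8_byte_count(text):
    """Return how many bytes `text` would occupy in UTF-8.

    Hypothetical helper (not a builtin). Assumes a Python 3 str,
    where iterating yields whole code points.
    """
    total = 0
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:        # ASCII range: 1 byte
            total += 1
        elif cp < 0x800:     # up to U+07FF: 2 bytes
            total += 2
        elif cp < 0x10000:   # rest of the BMP: 3 bytes
            total += 3
        else:                # supplementary planes: 4 bytes
            total += 4
    return total


ustr = "example\u009d"               # same 8 characters as in the question
num_chars = len(ustr)
num_bytes = utf8_byte_count(ustr)    # no encoded copy is built
```

This trades the extra buffer for a Python-level loop, so it is probably slower than len(ustr.encode('UTF-8')) on short strings. On a narrow Python 2 build, a character outside the BMP is stored as two surrogates and would be counted as 6 bytes instead of 4, so the sketch should not be used there as written.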