On Thu, Apr 4, 2013 at 4:43 AM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
> On Wed, 03 Apr 2013 10:38:20 -0600, Ian Kelly wrote:
>
>> On Wed, Apr 3, 2013 at 9:02 AM, Steven D'Aprano
>> <steve+comp.lang.pyt...@pearwood.info> wrote:
>>> On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:
>>>
>>> [...]
>>>>> n = max(map(ord, s))
>>>>> 4 if n > 0xffff else 2 if n > 0xff else 1
>>>>
>>>> This has to inspect the entire string, no?
>>>
>>> Correct. A more efficient implementation would be:
>>>
>>> def char_size(s):
>>>     for n in map(ord, s):
>>>         if n > 0xFFFF: return 4
>>>         if n > 0xFF: return 2
>>>     return 1
>>
>> That's an incorrect implementation, as it would return 2 at the first
>> non-Latin-1 BMP character, even if there were SMP characters later in
>> the string. It's only safe to short-circuit return 4, not 2 or 1.
>
>
> Doh!
>
> I mean, well done sir, you have successfully passed my little test!

Try this:

def str_width(s):
    width = 1
    for ch in map(ord, s):
        if ch > 0xFFFF: return 4
        if ch > 0xFF: width = 2
    return width

ChrisA
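
For anyone who wants to check this against Ian's objection, here is a minimal runnable sketch of the same str_width approach, set next to the full-scan version from earlier in the thread. The name str_width_full_scan is made up for illustration, and default=0 is my own addition (it needs Python 3.4+) so that the empty string doesn't raise; neither comes from the original posts.

```python
def str_width(s):
    """Return 1, 2 or 4: the narrowest per-character width that fits s."""
    width = 1
    for code_point in map(ord, s):
        if code_point > 0xFFFF:
            # Outside the BMP: nothing can be wider, so stop scanning here.
            return 4
        if code_point > 0xFF:
            # Beyond Latin-1: remember it, but keep looking for SMP characters.
            width = 2
    return width


def str_width_full_scan(s):
    """Hypothetical reference version: always inspects the whole string."""
    # default=0 is an assumption added here so the empty string yields 1.
    n = max(map(ord, s), default=0)
    return 4 if n > 0xFFFF else 2 if n > 0xFF else 1


if __name__ == "__main__":
    samples = ["abc", "caf\u00e9", "\u0394x", "\u0394\U0001F600", ""]
    for sample in samples:
        print(ascii(sample), str_width(sample), str_width_full_scan(sample))
```

The fourth sample is the case Ian described: a non-Latin-1 BMP character followed by an SMP character. Returning 2 at the first character would give the wrong answer there, whereas only the return 4 is safe to short-circuit.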