On Wed, 03 Apr 2013 10:38:20 -0600, Ian Kelly wrote:

> On Wed, Apr 3, 2013 at 9:02 AM, Steven D'Aprano
> <steve+comp.lang.pyt...@pearwood.info> wrote:
>> On Wed, 03 Apr 2013 09:43:06 -0400, Roy Smith wrote:
>>
>> [...]
>>>> n = max(map(ord, s))
>>>> 4 if n > 0xffff else 2 if n > 0xff else 1
>>>
>>> This has to inspect the entire string, no?
>>
>> Correct. A more efficient implementation would be:
>>
>> def char_size(s):
>>     for n in map(ord, s):
>>         if n > 0xFFFF: return 4
>>         if n > 0xFF: return 2
>>     return 1
>
> That's an incorrect implementation, as it would return 2 at the first
> non-Latin-1 BMP character, even if there were SMP characters later in
> the string. It's only safe to short-circuit return 4, not 2 or 1.

Doh!

I mean, well done sir, you have successfully passed my little test!


-- 
Steven
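
For reference, a minimal sketch of the fix Ian describes, assuming the same thresholds as the quoted code (0xFF and 0xFFFF). The idea is to remember the widest size seen so far and return early only on 4, since nothing later in the string can exceed that. The local name `size` is my own, not from the thread.

```python
def char_size(s):
    """Return the widest per-character size (1, 2 or 4) needed for s."""
    size = 1
    for n in map(ord, s):
        if n > 0xFFFF:
            # An SMP (astral) character: nothing can be wider, so it is
            # safe to stop scanning here.
            return 4
        if n > 0xFF:
            # A non-Latin-1 BMP character: remember it, but keep going
            # in case an SMP character appears later.
            size = 2
    return size
```

With this version, a string such as "\u0394\U0001F600" yields 4 rather than 2, at the cost of scanning the whole string whenever no SMP character is present.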