On Thu, Apr 4, 2013 at 12:43 AM, Roy Smith <r...@panix.com> wrote:
> This has to inspect the entire string, no?  I posted (essentially) this
> a few days ago:
>
>     if all(ord(c) <= 0xffff for c in s):
>         return "it's all bmp"
>     else:
>         return "it's got astral crap in it"
>
> I'm reasonably sure all() is smart enough to stop at the first False
> value.

Probably, but it still has to scan the body of the string. That's not too bad if the string is all astral, since all() bails out at the first astral character, but if it's all BMP, it has to scan the whole string. In the max() case, it has to scan the whole string anyway, as there's no other way to determine the maximum.

I'm thinking here of this function:

http://pike.lysator.liu.se/generated/manual/modref/ex/7.2_3A_3A/String/width.html

It's implemented as a simple lookup into the header. (Pike strings, like PEP 393 strings, are stored in the most compact way possible - 1, 2, or 4 bytes per character - with a conceptually similar header structure.)

Is this something that would be worth having available? Should I post an issue about it?

ChrisA

More for my own reference than anyone else's, the source of Pike's String.width():
http://pike-git.lysator.liu.se/gitweb.cgi?p=pike.git;a=blob;f=src/builtin.cmod;hb=HEAD#l1077

-- 
http://mail.python.org/mailman/listinfo/python-list
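
P.S. To make the comparison concrete, here is a rough pure-Python sketch of the two scanning approaches discussed above. The names is_all_bmp() and char_width_bytes() are made up for illustration and are not an existing Python API; the second is only a stand-in for the kind of answer a header lookup would give, and it still has to scan.

```python
def is_all_bmp(s):
    """Return True if every code point in s is in the BMP (<= U+FFFF)."""
    # all() short-circuits: it stops at the first astral character,
    # but an all-BMP string is still scanned to the end.
    return all(ord(c) <= 0xFFFF for c in s)


def char_width_bytes(s):
    """Hypothetical helper: bytes per character (1, 2, or 4) needed
    to store s in the most compact fixed-width form."""
    # max() has no early exit, so this always scans the whole string.
    top = max(map(ord, s)) if s else 0
    if top <= 0xFF:
        return 1
    if top <= 0xFFFF:
        return 2
    return 4


if __name__ == "__main__":
    for sample in ("hello", "\u20ac100", "astral: \U0001F600"):
        print(ascii(sample), is_all_bmp(sample), char_width_bytes(sample))
```

Both functions are O(n) in the length of the string. The point of exposing the width from the header would be to get the same answer in constant time.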