Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> writes:

> result = text[end:]

If end is not near the end of the original string, then this is O(N)
even with a fixed-width representation, because of the character
copying.  If it is near the end, then by knowing where the string's
data area ends, I think it should be possible to scan backwards from
the end, recognizing which bytes can begin a code point and counting
off the appropriate number (see the sketch at the bottom of this
message).  This is O(1) if "near the end" means "within a constant".

> You could say "Screw the full Unicode standard, who needs more than 64K

No: if you're claiming the language supports Unicode, it should be the
whole standard.

> You could do what Python 3.2 narrow builds do: use UTF-16 and leave it
> up to the individual programmer to track character boundaries,

I'm surprised the Python 3 implementers even considered that approach,
much less went ahead with it.  It's obviously wrong.

> You could add a whole lot more heavyweight infrastructure to strings,
> turn them into suped-up ropes-on-steroids.

I'm not persuaded that PEP 393 isn't even worse.
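
Here is a minimal sketch of the backwards scan I have in mind, assuming
the string is stored as UTF-8 bytes with the code point count kept
alongside it.  The function name tail_slice and its parameters are just
illustrative; in UTF-8, bytes of the form 0b10xxxxxx are continuation
bytes, so every other byte starts a code point and can be counted from
the rear without touching the front of the buffer.

    def tail_slice(data, length, start):
        """Return the bytes of data[start:], where data is UTF-8 and
        length is its code point count, by scanning backwards from the
        end.  Cheap when start is close to length."""
        need = length - start            # code points to keep
        pos = len(data)
        count = 0
        while count < need:
            pos -= 1
            if data[pos] & 0xC0 != 0x80: # not a continuation byte,
                count += 1               # so it starts a code point
        return data[pos:]

    # Example: keep only the last two characters.
    s = "héllo wörld"
    b = s.encode("utf-8")
    assert tail_slice(b, len(s), len(s) - 2).decode("utf-8") == s[-2:]

The loop only touches the trailing bytes, so the cost is proportional
to how far start is from the end of the string, not to the length of
the whole string.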
if end not near the end of the original string, then this is O(N) even with fixed-width representation, because of the char copying. if it is near the end, by knowing where the string data area ends, I think it should be possible to scan backwards from the end, recognizing what bytes can be the beginning of code points and counting off the appropriate number. This is O(1) if "near the end" means "within a constant". > You could say "Screw the full Unicode standard, who needs more than 64K No if you're claiming the language supports unicode it should be the whole standard. > You could do what Python 3.2 narrow builds do: use UTF-16 and leave it > up to the individual programmer to track character boundaries, I'm surprised the Python 3 implementers even considered that approach much less went ahead with it. It's obviously wrong. > You could add a whole lot more heavyweight infrastructure to strings, > turn them into suped-up ropes-on-steroids. I'm not persuaded that PEP 393 isn't even worse. -- http://mail.python.org/mailman/listinfo/python-list