On 30/11/2013 00:44, Steven D'Aprano wrote:
> (5) What is the length of "😸😾"?
>
> Both characters, U+1F638 (GRINNING CAT FACE WITH SMILING EYES) and U+1F63E (POUTING CAT FACE), are outside the Basic Multilingual Plane, which means they need more than two bytes each (in UTF-16, a surrogate pair of two 16-bit code units). Most programming languages that use UTF-16 internally (including JavaScript and Java) fail this test. Python 3.3 passes:
>
> py> s = '😸😾'
> py> len(s)
> 2
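For anyone who wants to check the arithmetic, here is a minimal sketch (not from the original post) that compares the number of code points with the number of UTF-16 code units and UTF-8 bytes for the same two characters. The variable names are made up for illustration, and it assumes Python 3.3 or later.

```python
# The two cat faces, written as escapes so the source file encoding does not matter.
s = "\U0001F638\U0001F63E"

# Python 3.3+ counts code points, so each character counts once.
code_points = len(s)

# A UTF-16 code unit is 2 bytes; characters outside the BMP take a surrogate pair.
# This is the count that a UTF-16-based length() would report.
utf16_units = len(s.encode("utf-16-le")) // 2

# In UTF-8, each of these characters takes 4 bytes.
utf8_bytes = len(s.encode("utf-8"))

print(code_points, utf16_units, utf8_bytes)
```

The middle number is what a language that measures strings in UTF-16 code units would give for the same string.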
I couldn't care less whether it passes: it's too slow and uses too much memory[1], so please restore the completely bug-ridden Python 2 Unicode implementation at the earliest possible opportunity :)
[1] Because I say so, although I don't actually have any evidence to support my case. :) :)
--
Python is the second best programming language in the world.
But the best has yet to be invented.  Christian Tismer

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list