Basically, I think Twitter's broken. For my full discussion on the matter, see:
Here's my first post, ineffectually edited for this list:

"""
<strikethrough>The obvious solution [to getting the length of a tweet] is wrong. Like, slightly wrong¹.</strikethrough>

Given tweet = b"caf\x65\xCC\x81".decode():

    >>> tweet
    'café'

But:

    >>> len(tweet)
    5

So the solution is:

    >>> import unicodedata
    >>> len(unicodedata.normalize("NFC", tweet))
    4

<strikethrough>Read Twitter's commentary¹ for proof.</strikethrough>

<strikethrough>There are additional complications I'm trying to sort out.</strikethrough>

________________________________

After further testing (I don't actually use Twitter) it seems the whole thing was just smoke and mirrors. The linked article is a lie, at least on the user's end.

On Linux you can prove this by running:

    >>> import subprocess
    >>> p = subprocess.Popen(['xsel', '-bi'], stdin=subprocess.PIPE)
    >>> p.communicate(input=b"caf\x65\xCC\x81")
    (None, None)

"café" will be in your copy-paste buffer, and you can paste it into the tweet box. It takes 5 characters.

So much for testing ;).

________________________________

¹
"""

I know this isn't *really* Python-related, but there's Python involved, and you're the sort of people who'll be able to tell me what I've done wrong, if anything.
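For anyone who wants to reproduce the numbers without a REPL session, here is a minimal sketch that puts the three candidate "lengths" of that string side by side: raw code points, code points after NFC normalization, and UTF-8 bytes. The helper name describe is made up for illustration, and nothing here says which of these counts Twitter's tweet box actually uses.

```python
import unicodedata

# The string from the post: "cafe" followed by U+0301 COMBINING ACUTE ACCENT.
tweet = b"caf\x65\xCC\x81".decode("utf-8")


def describe(text):
    """Hypothetical helper: return (code points, NFC code points, UTF-8 bytes)."""
    nfc = unicodedata.normalize("NFC", text)
    return len(text), len(nfc), len(text.encode("utf-8"))


raw_len, nfc_len, byte_len = describe(tweet)
print("code points:      ", raw_len)
print("NFC code points:  ", nfc_len)
print("UTF-8 bytes:      ", byte_len)

# List each code point so the decomposed "e" + accent pair is visible.
for ch in tweet:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

If the tweet box reports 5 for the pasted text, that matches the raw code point count rather than the NFC count, which is consistent with what I saw.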