Chris Angelico <ros...@gmail.com>:

> Actually, the implementation I detailed was far SIMPLER than I thought
> it would be; I started writing that post trying to prove that it was
> impossible, but it turns out it isn't actually impossible. Just highly
> impractical.

The existing str implementation could be tweaked to accommodate the
"super code points" I proposed:

Add a pointer field to CPython's UCS-4 string variant. Behind the
pointer is an array of 64-bit pointers.

If any code point in the string is 1114112 (0x110000) or greater,
subtract 1114112 from it to get an index into the pointer array.

If the pointer at the index is odd, cast it to uint64_t and shift it
right by one bit to get the super code point. Such a packed super code
point can hold 3 full code points (3 * 21 bits).

If the pointer at the index is even, it is a reference to a bigint
value representing the super code point.


Marko
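
For illustration only, the lookup described above might be sketched in
Python roughly as follows. This is a simulation, not CPython code: the
names (BigintRef, make_entry, decode) are made up, table entries are
plain ints standing in for odd tagged words, and a small wrapper object
stands in for an even (aligned) pointer to a bigint. The packing order
(first code point in the most significant bits, no U+0000 inside a
super code point) and the bigint layout are assumptions; the proposal
does not specify them.

```python
UNICODE_LIMIT = 0x110000          # 1114112, one past the last code point
CP_BITS = 21                      # bits needed for one code point
CP_MASK = (1 << CP_BITS) - 1


class BigintRef:
    """Hypothetical stand-in for an even pointer to a bigint object."""

    def __init__(self, value):
        self.value = value


def join_code_points(code_points):
    """Concatenate code points, 21 bits each, first one most significant."""
    value = 0
    for cp in code_points:
        # Assumption: U+0000 is excluded so the length can be recovered.
        if not 0 < cp < UNICODE_LIMIT:
            raise ValueError("not a usable code point: %r" % (cp,))
        value = (value << CP_BITS) | cp
    return value


def split_code_points(value):
    """Inverse of join_code_points()."""
    out = []
    while value:
        out.append(value & CP_MASK)
        value >>= CP_BITS
    return tuple(reversed(out))


def make_entry(code_points):
    """Build one slot of the side array behind the extra pointer field."""
    value = join_code_points(code_points)
    if len(code_points) <= 3:
        # Up to 63 bits of payload, shifted left with the low bit set:
        # an odd 64-bit word that can never be a valid aligned pointer.
        return (value << 1) | 1
    # Anything longer goes behind a (simulated) even pointer to a bigint.
    return BigintRef(value)


def decode(units, table):
    """Expand UCS-4 units into tuples of real code points."""
    result = []
    for unit in units:
        if unit < UNICODE_LIMIT:
            result.append((unit,))            # ordinary code point
            continue
        entry = table[unit - UNICODE_LIMIT]   # index into the side array
        if isinstance(entry, int):            # odd tagged word
            result.append(split_code_points(entry >> 1))
        else:                                 # reference to a bigint
            result.append(split_code_points(entry.value))
    return result
```

Here a string such as "H", "e" + U+0301, "!" would be stored as the
units [0x48, 0x110000, 0x21] with make_entry([0x65, 0x301]) in slot 0
of the side array.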