On Wed, Mar 8, 2017 at 9:05 AM, John Nagle <na...@animats.com> wrote:
> How do I test if a Python 2.7.8 build was built for 32-bit
> Unicode? (I'm dealing with shared hosting, and I'm stuck
> with their provided versions.)
>
> If I give this to Python 2.7.x:
>
>     sy = u'\U0001f60f'
>
> len(sy) is 1 on a Ubuntu 14.04LTS machine, but 2 on the
> Red Hat shared hosting machine. I assume "1" indicates
> 32-bit Unicode capability, and "2" indicates 16-bit.
> It looks like Python 2.x in 16-bit mode is using a UTF-16
> pair encoding, like Java. Is that right? Is it documented
> somewhere?

That's correct. A narrow build will treat that as a pair of
surrogates. You may also be able to check this way:

>>> import sys
>>> sys.maxunicode
1114111

That value (0x10FFFF) means a wide build; a narrow build reports
65535 instead.

> (Annoyingly, while the shared host has a Python 3, it's
> 3.2.3, which rejects "u" Unicode string constants and
> has other problems in the MySQL area.)

Yeah, you'll do well to get a newer Py3 than that. Fortunately, any
Linux old enough to be shipping 3.2 is likely to not depend on it in
any way, so you can install a new Py3 (maybe even 3.6) and shadow the
name "python3" with that. That's what I did when I was on Debian...
Squeeze, I think? and nothing newer than 3.2 was available. As soon
as you hit 3.3, the u"..." prefix becomes legal again, and subsequent
versions have added even more compatibility.

ChrisA
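
For anyone who wants the check from this thread in script form, here is a minimal sketch. The helper names unicode_build_width and code_point_count are made up for illustration and are not part of any library. It assumes only that sys.maxunicode is 0xFFFF on a narrow build, and it counts a high/low surrogate pair as one character so that the result is the same on either kind of build.

```python
import sys


def unicode_build_width():
    """Return 'narrow' on a UTF-16 (16-bit) build, otherwise 'wide'.

    Hypothetical helper: relies on sys.maxunicode being 0xFFFF
    on narrow builds and 0x10FFFF on wide ones.
    """
    return "narrow" if sys.maxunicode == 0xFFFF else "wide"


def code_point_count(text):
    """Count code points, treating each surrogate pair as one.

    Hypothetical helper: on a narrow build len() counts UTF-16
    code units, so a non-BMP character contributes 2 to len().
    """
    count = 0
    i = 0
    n = len(text)
    while i < n:
        is_high = 0xD800 <= ord(text[i]) <= 0xDBFF
        has_low = i + 1 < n and 0xDC00 <= ord(text[i + 1]) <= 0xDFFF
        # Skip both halves of a well-formed pair, else one unit.
        i += 2 if (is_high and has_low) else 1
        count += 1
    return count


if __name__ == "__main__":
    sy = u'\U0001f60f'
    print(unicode_build_width())
    print(len(sy))
    print(code_point_count(sy))
```

On the narrow Red Hat build, len(sy) should come out as 2 while code_point_count(sy) stays at 1. Note that from Python 3.3 onward the narrow/wide distinction no longer exists, so this only matters for 2.x and for 3.x up to 3.2.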