> Is it a bug in unicodedata, or is this the expected behaviour on a
> narrow build?

It's a bug. It should either raise an exception or return the correct
result. If you now feel like submitting a bug report: please try to come
up with a patch instead.

> Another problem I have is to access the "characters" and their
> properties by the respective codepoints: under FFFF it is possible
> to use unichr(), which isn't valid for higher values on a narrow
> build
>
> It is possible to derive the codepoint from the surrogate pair,
> which would be usable also for wider codepoints.

See PEP 261. This is by design.

> Currently, I'm using a kind of parallel database for some unicode
> ranges above FFFF, but I don't think this is the most effective way.

Just use a wide Unicode build instead.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list
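
P.S. On deriving the codepoint from a surrogate pair: the following is only a rough sketch of the standard UTF-16 arithmetic, not anything provided by unicodedata. The function names are made up for illustration. It works on integer code units, so it should behave the same on narrow and wide builds.

```python
# Hypothetical helpers (names invented for this sketch) implementing
# the standard UTF-16 surrogate arithmetic on plain integers.

def surrogate_pair_to_codepoint(high, low):
    """Combine a high and a low surrogate code unit into one codepoint."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: %#x" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: %#x" % low)
    # Each surrogate carries 10 bits of the offset above U+10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def codepoint_to_surrogate_pair(codepoint):
    """Split a codepoint above U+FFFF into (high, low) surrogates."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("codepoint outside U+10000..U+10FFFF")
    offset = codepoint - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# U+1D400 is stored as the pair D835 DC00 in UTF-16.
print(hex(surrogate_pair_to_codepoint(0xD835, 0xDC00)))
```

For a two-unit string s on a narrow build, the inputs would be ord(s[0]) and ord(s[1]). This only recovers the codepoint number; it does not make unichr() or the unicodedata lookups accept it on a narrow build.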