E. Paine <paineeli...@gmail.com> added the comment:
Sorry, the point I was trying to make was that, unlike UTF-8, Tcl doesn't support variable-length characters; they are instead fixed at 16 bits (by default). So, while Python and UTF-8 are perfectly happy with the emoji, Tcl will not process the character correctly unless it is compiled with a particular build flag (which is why I said it was surprising that Chip showed at all). I have tested on Tcl 8.6.10 and encountered the same problem described.

A further quote (granted, also old, but I cannot find anything to suggest this behaviour has been changed):

"Tcl can (currently) only represent characters within the Basic Multilingual Plane of Unicode, so there's no way that you can even feed an U+10000 into encoding convertto :-(. Fixing that is non-trivial, since some parts of Tcl (the C library) require a representation of strings where all characters take up the same number of bytes. It is possible to compile Tcl with that "number of bytes" set to 4 (meaning 32 bits per character), but it's rather wasteful, and has been reported not entirely compatible with Tk."
[https://wiki.tcl-lang.org/page/utf-8]

If I can find the build flag mentioned, I will post it here for future reference.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41212>
_______________________________________
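To illustrate the point in the comment above about fixed 16-bit characters, here is a minimal Python sketch. It only shows why an emoji cannot fit in a single 16-bit unit; it does not call Tcl or Tk. The choice of the chipmunk emoji (U+1F43F) and the helper name fits_in_bmp are assumptions made for illustration, not something taken from the report.

```python
# Illustrative sketch only: the emoji chosen here (U+1F43F) is an assumed example.
emoji = "\U0001F43F"

code_point = ord(emoji)                            # Unicode code point as an int
utf8_bytes = emoji.encode("utf-8")                 # variable-length encoding
utf16_units = len(emoji.encode("utf-16-le")) // 2  # number of 16-bit code units


def fits_in_bmp(text):
    """Hypothetical helper: True if every character fits in one 16-bit unit."""
    return all(ord(ch) <= 0xFFFF for ch in text)


print(f"code point: U+{code_point:04X}")
print(f"UTF-8 length in bytes: {len(utf8_bytes)}")
print(f"UTF-16 code units: {utf16_units}")
print(f"fits in the Basic Multilingual Plane: {fits_in_bmp(emoji)}")
```

A check like fits_in_bmp could be used to screen text before handing it to a Tcl build that stores characters in 16 bits, though whether a given build needs that depends on how it was compiled.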