https://bugs.kde.org/show_bug.cgi?id=395171
--- Comment #8 from Egmont Koblinger <egm...@gmail.com> ---
(In reply to Jayadevan from comment #7)

I stopped working on terminal emulation about a year ago. Yet, I'm making a single exception here to respond (i.e. I most likely won't follow up, so please don't write expecting a response from me).

> Please reject such proposals, as those are discriminatory.

I firmly refute this claim. There is nothing discriminatory in the proposal whatsoever. The reason behind this request – and this should be obvious to everyone who takes the time to really _understand_ the post and the linked article – is that UTF-16 (and a few friends) as the _I/O_ encoding *does not work*, *never worked* and, even more importantly, *cannot be fixed to work*.

More precisely, you can write a terminal emulator that speaks this encoding, but when placed in its context (i.e. surrounded by a Unix kernel, libc, higher level libraries, tools, apps, tmux-likes, other computers to ssh to/from, etc.) it won't do anything that makes sense, since all the surrounding infrastructure only supports ASCII-compatible encodings for communication with the terminal.

In order to support UTF-16 as the _I/O_ encoding, in a way that you actually get a working ecosystem around the terminal with this encoding, you'd need:

- modifications to the kernel's tty handling (line discipline, stty special characters etc.);
- modifications to the kernel's tty-accessing API, to enforce UTF-16, or at least an even number of bytes on all operations that write to / read from a tty (or to work with 16-bit units instead of 8-bit ones), in order to exclude the possibility of going out of sync and causing permanent breakage (a concrete illustration follows below);
- the corresponding changes in standards (e.g. POSIX);
- the same changes in libc;
- heavy modifications in all the apps (e.g. changing from '\0'-terminated byte strings to wide strings or whatnot);
- throwing out any shell script that contains even an "echo foo" (in an ASCII-compatible encoding), because that would outright break the terminal if sent out as-is;
- rethinking "cat" (how do you transfer a potentially odd number of bytes into a channel that expects an even number?);
- adding UTF-16 locales;

and so on and so forth... I have just sketched a tiny subset of the problems. You'd need to essentially rethink and adjust all the APIs, libraries, every single tool and application inside the terminal: literally everything. All this in order to create a system that's utterly incompatible with what we already have and, in terms of the user-visible outcome, is not one bit better. It's clearly not going to happen, and even if it did happen, it would clearly be harmful. There is no politics or discrimination here at all; this is purely technical.
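To make the desync point concrete, here is a minimal Python sketch (my own illustration with made-up byte strings; none of this is Konsole code):

    # Every existing tool emits ASCII-compatible bytes. A terminal that
    # expects UTF-16 (little-endian here) pairs those bytes up into
    # bogus 16-bit code units instead of "echo foo":
    ascii_output = b"echo foo\n"
    print(ascii_output.decode("utf-16-le", errors="replace"))
    # -> a few CJK-looking characters plus a replacement character,
    #    since these 9 bytes don't even split into whole 16-bit units

    # The permanent breakage: a single odd-length write (say, "cat" on
    # a file with an odd number of bytes) shifts the pairing of every
    # subsequent byte, and nothing in the protocol lets you resync.
    good = "hello".encode("utf-16-le")   # 10 bytes, decodes fine
    print(good.decode("utf-16-le"))      # -> hello
    broken = b"\n" + good                # one stray ASCII newline first
    print(broken.decode("utf-16-le", errors="replace"))
    # -> garbage: every 16-bit unit is now misaligned, "hello" is gone

The exact garbage glyphs don't matter; the point is that they are not what any existing tool intended to send, and after the stray odd byte there is no way for the terminal and its peers to ever get back in agreement.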
> UTF-8 is Anglo-centric. UTF-16 treats each writing system more fairly.

UTF-8 can represent the exact same things as UTF-16. They support all writing systems to the very same extent. The only sense in which one could perhaps claim that UTF-8 is Anglo-centric is that it uses 1 byte for English letters vs. 3 bytes for CJK (Chinese, Japanese, Korean) symbols, whereas UTF-16 uses 2 for both. Given that an English letter represents, well, a single letter of a word, whereas a CJK symbol represents a syllable or an entire word, I actually think UTF-8's 1:3 split is the fairer system. (Let alone that the typical work happening inside a terminal is usually English-centric.)

By the way: who cares? With today's network speeds, combined with the tiny amount of terminal data compared to any other activity you do over any network, the difference in byte count simply does not matter at all.

> Since KDE Internally uses UTF-16, UTF-16 should be supported.

Trying to make a connection between the _internal_ encoding and the _I/O_ encoding is not justified at all. As an occasional user of Konsole I don't have the slightest idea what encoding it uses _internally_, and that's how it should be. Users shouldn't care; users shouldn't need to care. If users needed to care, it would mean that the developers did a terrible job. The internal encoding is subject to change by the developers at any time, without any user noticing it. What _I/O_ encodings Konsole supports (or, in this case, incorrectly claims to support) is an utterly independent story.

> Also, UTF-16 is used by KDE, QT, C/C++ (From ICU), Java, Windows,
> JavaScript, Android, DartVM, Dart Language, and modern frameworks
> like Flutter.

You see: they made a choice. They don't offer alternatives; they decided on one encoding. The same goes for terminals. They decided on UTF-8; unsurprisingly so, since for a million technical reasons the encoding needs to be ASCII-compatible, while there is a natural need to encode any text. Many modern terminal emulators only support UTF-8 and nothing else. Many other terminal emulators support some legacy, deprecated encodings for backwards compatibility, from back in the days when the world hadn't yet settled on UTF-8; but those at least work. And then there's Konsole, offering some choices that never worked, don't work, and will never work, due to a myriad of technical issues.

The direction is not to offer alternatives senselessly. Especially not if such an alternative would require redesigning and rewriting pretty much every single component of the ecosystem. The direction is one single mode of operation that is perfect for everybody. As for the terminals' _I/O_ encoding, this is UTF-8. No culture, no language, no writing system, no human being was discriminated against by this choice. The current UTF-8 approach supports everything that a UTF-16 approach could possibly support (if it were reasonable and feasible to implement, which it is not). Choosing one technical solution over another (even if the other were viable too, which is not the case here) is not discrimination. It is proper engineering.

The current bug report is about the removal of a claimed feature that doesn't work, never worked, and cannot be made to work.