Hi,

I'm trying to respond to every question, but I'm having a hard time keeping up :-)

Thanks a lot for all the precious input about shaping!

Here's my suggestion for version 0.2 of the recommendation:

- No longer encourage any use of presentation form characters.

- State that it's the terminal emulator's task to perform shaping, both in implicit and explicit modes.

- Leave it for a future enhancement to handle trickier cases in explicit mode, such as shaping a word that's only partially visible, or preventing shaping when two words happen to touch each other and are visually separated by other means (e.g. background color). Leave it for further research whether we could use ZWJ/ZWNJ here, whether we could use ECMA's SAPV 5-8 & 21-11, or whether we should invent something new (perhaps even telling the terminal emulator what neighboring previous/next characters to imagine there for the purpose of shaping)...

Let me know if you have any remaining problems/concerns/etc.

As for the implementation in VTE: initially I'll still use presentation form characters, solely because that's the low-hanging-fruit approach (low investment, high gain). I've already implemented it in about an hour (a few further hacks will be necessary to extend it to explicit mode, but that's still easily doable), whereas switching to HarfBuzz is expected to take weeks of heavy work. We'll tackle that in a subsequent version. And if anyone's happy to help, there's already some bounty for HarfBuzz support :)

Thanks again for the great guidance!

cheers,
egmont

On Tue, Jan 29, 2019 at 1:50 PM Egmont Koblinger <egm...@gmail.com> wrote:
>
> Hi,
>
> Terminal emulators are a powerful tool used by many people for various
> tasks. Most terminal emulators' bugtracker has a request to add RTL /
> BiDi support. Unicode has supported BiDi for about 20 years now.
> Still, the intersection of these two fields isn't solved. Even some
> Unicode experts have stated over time that no one knows how to do it
> properly.
>
> The only documentation I could find (ECMA TR/53) predates the Unicode
> BiDi algorithm, and as such no surprise that it doesn't follow the
> current state of the art or best practices.
>
> Some terminal emulators decided to run the BiDi algorithm for display
> purposes on its lines (rather than paragraphs, uh), not seeing the big
> picture that such a behavior turns them into a platform on top of
> which it's literally impossible to implement proper BiDi-aware text
> editing (vim, emacs, whatever) experience. In turn, vim, emacs and
> friends stand there clueless, not knowing how to do BiDi in terminals.
>
> With about 5 years of experience in terminal emulator development, and
> some prior BiDi homepage developing experience with the kind mentoring
> of one of the BiDi gurus (Aharon, if you're reading this, hi there!),
> I decided to tackle this issue. I studied and evaluated the
> aforementioned documentation and the behavior of such terminals,
> pointed out the problems, and came up with a draft proposal.
>
> My work isn't complete yet. One of the most important pending issues
> is to figure out how to track BiDi control characters (e.g. which
> character cells they belong to), it is to be addressed in a subsequent
> version. But I sincerely hope I managed to get the basics right and
> clean enough so that work can begin on implementing proper support in
> terminal emulators as well as fullscreen text applications; and as we
> gain experience and feedback, extending the spec to address the
> missing bits too.
>
> You can find this (draft) specification at [1]. Feedback is welcome –
> if it's an actionable one then preferably over there in the project's
> bugtracker.
>
> [1] https://terminal-wg.pages.freedesktop.org/bidi/
>
>
> cheers,
> egmont (GNOME Terminal / VTE co-developer)