On Wed, 19 Mar 2014 14:17:00 +0100 Daniel Bünzli <[email protected]> wrote:
> On Wednesday, 19 March 2014 at 02:33, Doug Ewell wrote:
> > There are two types of people:
> >
> > 1. those who fully expect Backspace to erase a single keystroke,
> > and feel it is a fatal flaw if it erases an entire combination, and
> >
> > 2. those who fully expect Backspace to erase an entire combination,
> > and feel it is a fatal flaw if it erases just a single keystroke.
> >
> > Unfortunately, both types exist in significant numbers.

And I belong to a third group - I expect it to delete a Unicode character.

> Isn't it possible to classify appartenance to 1 or 2 according to
> script ? E.g. I suspect most french speaking person when backspacing
> an É would like to erase the whole combination; for é it seems even
> more obvious since usually it's introduced with a single keystroke.

It's not as simple as script. An English speaker who enters é on a keyboard normally needs multiple keystrokes, most typically via a dead key. Now, if I type it in using an out-of-order sequence such as 'e, it is quite reasonable for it to be stored as a single composed character and deleted as a whole by backspace. On the other hand, if I type it in using an XSAMPA-based keyboard sequence such as e_H, I expect the backspace to delete just the accent, just as I am used to for the sequence O_H, which yields 2 characters, open o with acute (ɔ́). The diacritic here would not be arbitrary - I would be using it to indicate a specific tone. (It came as a nasty shock to find that my e-mail client, Claws on Ubuntu, takes out the entire cluster. For Thai legacy grapheme clusters, it just takes out the last character entered.)

At the moment I have made life more difficult for myself by devising a keyboard that generates NFC if the keystrokes are in the right order. As a reasonable guide, backspace should not take out more than one NFC character, and I would defend this even for Cyrillic-script tone marking in Serbian.
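
To make the three expectations concrete, here is a minimal Python sketch of the competing backspace policies. The function names are my own and purely illustrative; none of this is taken from Claws, Keyman or any other real input method. The "cluster" policy is only a rough stand-in for grapheme clusters: it treats any character whose general category starts with M as part of the preceding cluster. The NFC policy normalises the whole buffer first, which a real editor might not want to do.

```python
import unicodedata


def backspace_code_point(text: str) -> str:
    """Delete the last code point, whatever it is."""
    return text[:-1]


def backspace_nfc_character(text: str) -> str:
    """Normalise the buffer to NFC, then delete one code point of the result.

    Assumption: normalising the whole buffer is acceptable here.
    """
    return unicodedata.normalize("NFC", text)[:-1]


def backspace_cluster(text: str) -> str:
    """Delete trailing marks together with their base character.

    Rough approximation of a grapheme cluster: any character whose
    general category starts with 'M' is attached to what precedes it.
    """
    end = len(text)
    while end > 0 and unicodedata.category(text[end - 1]).startswith("M"):
        end -= 1
    # Also drop the base character, if there is one.
    return text[:max(end - 1, 0)]
```

With a decomposed e + U+0301, the code point policy leaves "e", while the NFC and cluster policies both leave nothing. With ɔ + U+0301, which has no precomposed form, the code point and NFC policies both leave ɔ, and only the cluster policy removes the vowel as well.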

Now, there's supposed to be an interface definition for using incremental keyboard typing as in Keyman, where keyboards can be arranged so that one sees what's been typed in already. Where is it? It is rather important for an application to know when it can normalise input characters. For example, LibreOffice helpfully swaps round a Thai tone mark with a following vowel mark below, with the slightly bizarre consequence that the sequence ko kai, mai ek, sara u, backspace yields <KO KAI, SARA U>. Traditionally, the sequence yields a beep and just <KO KAI> - the input handler rejects the SARA U because it does not accord with the character order prescribed by WTT (wing thuk thi).

Richard.
_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode

