All the Keyman products (on Windows, web, iOS and Android), as well as KMFL, 
which is a port of Keyman, work on the principle of modifying the text buffer 
directly.  There is no intermediate compose buffer.  For Indic and western 
scripts this works pretty well; in my experience, the compose buffer that IMEs 
provide does not fit these scripts cleanly.  It is often hard to know when a 
text entry is 'complete' and the compose buffer can be committed, so the 
compose buffer tends to get very long, which makes accidental cancellation of 
input a common and frustrating issue.
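
As a rough illustration of editing the text buffer directly, here is a minimal 
Python sketch.  It is not Keyman's actual engine or rule format: the RULES 
table and the function names are hypothetical, made up for this example.  Each 
keystroke is applied to the buffer straight away, and a later keystroke may 
replace text that was already emitted.

```python
# Hypothetical rule table (an assumption for illustration, not Keyman syntax):
# (text already in the buffer before the caret, key pressed) -> replacement.
RULES = {
    ("a", "^"): "\u00e2",  # a then ^ gives â
    ("e", "^"): "\u00ea",  # e then ^ gives ê
}


def apply_key(buffer: str, key: str) -> str:
    """Apply one keystroke directly to the text buffer (no compose buffer)."""
    for (context, rule_key), output in RULES.items():
        if key == rule_key and buffer.endswith(context):
            # Cancel the previously emitted context and emit the replacement.
            return buffer[: len(buffer) - len(context)] + output
    # No rule matched: the key is simply appended to the buffer.
    return buffer + key


def type_keys(keys: str, buffer: str = "") -> str:
    """Feed a sequence of keystrokes through apply_key, one at a time."""
    for key in keys:
        buffer = apply_key(buffer, key)
    return buffer
```

With these rules, type_keys("ca^") leaves "câ" in the buffer: the "a" is 
visible in the text as soon as it is typed, and the "^" then replaces it.  
Nothing is held back waiting for a commit.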

The most obvious backspace intelligence I've seen in use is around handling NFC 
vs NFD text.  It is confusing to the end user if backspace sometimes deletes a 
whole character + diacritic, and sometimes just the diacritic mark.  For 
example, Vietnamese text has suffered from this issue with the varying 
composition schemes we've seen enforced by limited input methods.
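
To make the NFC vs NFD difference concrete, here is a small Python sketch.  It 
is only an assumption about how a consistent backspace could be written, not 
how Keyman or any particular editor does it, and the function names are 
hypothetical.  It compares deleting the last code point with deleting the last 
base character together with any combining marks that follow it.

```python
import unicodedata


def naive_backspace(text: str) -> str:
    """Delete the last code point only."""
    return text[:-1]


def cluster_backspace(text: str) -> str:
    """Delete trailing combining marks plus the base character before them.

    Simplification: any character whose general category starts with "M"
    is treated as part of the preceding base.  Real grapheme cluster rules
    are more involved than this.
    """
    end = len(text)
    # Step back over trailing marks (categories Mn, Mc, Me).
    while end > 0 and unicodedata.category(text[end - 1]).startswith("M"):
        end -= 1
    # Drop the base character as well, if there is one.
    return text[: max(end - 1, 0)]


nfc = "Vi\u1ec7"                         # Vi + LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
nfd = unicodedata.normalize("NFD", nfc)  # Vi + e + U+0323 + U+0302
```

Here naive_backspace(nfc) gives "Vi" but naive_backspace(nfd) gives "Vi" 
followed by e with dot below, which is the inconsistency the user sees.  
cluster_backspace gives "Vi" for both forms.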

-----Original Message-----
From: Unicode [mailto:[email protected]] On Behalf Of Richard 
Wordingham
Sent: Monday, 24 March 2014 12:07 AM
To: [email protected]
Subject: Re: Editing Sinhala and Similar Scripts

On Sun, 23 Mar 2014 03:32:06 +0100
Philippe Verdy <[email protected]> wrote:

> This is wrong, the IME or keyboard driver handles the state of 
> keystrokes, even if you use a COMPOSE key or a DEAD KEY, this does not 
> matter, and so it won't feed the encoded text with streams of 
> characters as long as the state is not complete enough:

This is certainly not true of Keyman for Linux (KMFL), and I don't believe it 
is true of Tavultesoft Keyman for Windows either.  Emitting characters before 
the state is complete does require that the input method have a way of 
cancelling previously provided input.  Now, if you use a method with a COMPOSE 
key or a DEAD key, you are generally unlikely to get tentative entries.  
However, one could write an input method that simulated a dead key but actually 
generated an output for it so as to imitate a typewriter differently.

> The effect of Backspace entered just after it would delete 
> simultaneously CGJ and the diacritic characters. It does not need to 
> depend on the input state of the driver or the IME. In all cases, 
> nothing in the keyboard mapping or IME will generate a CGJ character 
> in isolation, it will be always followed by something.

If backspace is not modified by the input method - and Marc Durdin has 
suggested that the input method should sometimes modify it - its effect will 
depend on the process controlling the backing store, which in general will work 
with multiple input methods, even during the course of a single editing 
session.  You might not write an input method that generates a single CGJ, but 
I do.  Do you insist on a soft hyphen when writing 'Llangollen' so that it will 
collate after 'Llanberis' in Welsh?  (I typed the place names in English; the 
names are spelt the same way in English and Welsh in hardcopy, though of course 
the letter counts differ.)
