Hi,
this very simple diff provides partial, naive UTF-8 support for
word handling in ksh(1) emacs mode.
It improves all functions involving words (forward-word, backward-word,
delete-word-forward, delete-word-backward, downcase-word, upcase-word,
capitalize-word) by allowing non-ASCII characters to be part of words.
This is not perfect: all non-ASCII characters become part of the
adjacent words, and the case of non-ASCII characters cannot be changed.
But it improves things a bit in a very non-intrusive way.
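
For illustration only, here is a minimal standalone sketch of the effect. The test in is_mfs() is the same as in the patched macro, just written as a function; forward_word() is a hypothetical helper invented for this example and is not the code in emacs.c. It assumes the C locale and UTF-8 input, where every byte of a multibyte character has the high bit set.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Same test as the patched macro: a byte separates words unless it
 * is alphanumeric, '_', '$', or has the high bit set (that is, it
 * belongs to a UTF-8 multibyte character).
 */
static int
is_mfs(int c)
{
	return !(isalnum((unsigned char)c) || c == '_' || c == '$' ||
	    (c & 0x80));
}

/*
 * Hypothetical helper, not taken from emacs.c: return the byte
 * offset reached by moving forward one word from pos.
 */
static size_t
forward_word(const char *buf, size_t len, size_t pos)
{
	assert(pos <= len);
	/* Skip separators in front of the word. */
	while (pos < len && is_mfs(buf[pos]))
		pos++;
	/* Skip the word itself, including any non-ASCII bytes. */
	while (pos < len && !is_mfs(buf[pos]))
		pos++;
	return pos;
}
```

With the input "echo caf\xc3\xa9 au lait", moving forward from byte offset 4 ends at offset 10, after both bytes of the accented character. With the old test, the motion would have stopped at offset 8, in front of the first non-ASCII byte.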
This is the final patch I'd like to commit to ksh/emacs.c for 5.9.
It is too early for adding support for double-width and zero-width
characters, and we are too close to release for that, anyway.
OK?
Ingo
Index: emacs.c
===================================================================
RCS file: /cvs/src/bin/ksh/emacs.c,v
retrieving revision 1.61
diff -u -p -r1.61 emacs.c
--- emacs.c 10 Dec 2015 10:00:14 -0000 1.61
+++ emacs.c 5 Jan 2016 18:35:54 -0000
@@ -49,7 +49,8 @@ struct x_ftab {
#define is_cfs(c) (c == ' ' || c == '\t' || c == '"' || c == '\'')
/* Separator for motion */
-#define is_mfs(c) (!(isalnum((unsigned char)c) || c == '_' || c == '$'))
+#define is_mfs(c) (!(isalnum((unsigned char)c) || \
+ c == '_' || c == '$' || c & 0x80))
/* Arguments for do_complete()
* 0 = enumerate M-= complete as much as possible and then list