On Thu, 25 Jul 2013 14:36:25 +0100, Jeremy Sanders wrote:

> wxjmfa...@gmail.com wrote:
>
>> Short example. Writing an editor with something like the FSR is simply
>> impossible (properly).
>
> http://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Representations.html#Text-Representations
>
> "To conserve memory, Emacs does not hold fixed-length 22-bit numbers
> that are codepoints of text characters within buffers and strings.
> Rather, Emacs uses a variable-length internal representation of
> characters, that stores each character as a sequence of 1 to 5 8-bit
> bytes, depending on the magnitude of its codepoint[1]. For example, any
> ASCII character takes up only 1 byte, a Latin-1 character takes up 2
> bytes, etc. We call this representation of text multibyte.
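
The byte counts in that quote can be checked against plain UTF-8 from Python. This is only a rough sketch: it uses Python's standard "utf-8" codec, not Emacs's extended internal format, so the 5-byte case from the manual is not reproduced here. The helper name utf8_len and the sample codepoints are my own choices for illustration.

```python
# Sketch: how many bytes standard UTF-8 needs per character.
# Assumption: Python's built-in "utf-8" codec stands in for the Unicode
# part of the Emacs representation; Emacs's own extensions are not covered.

def utf8_len(ch):
    """Return the number of bytes in the UTF-8 encoding of one character."""
    return len(ch.encode("utf-8"))

# Hypothetical sample: one ASCII, one Latin-1, one BMP, one astral codepoint.
samples = [0x41, 0xE9, 0x20AC, 0x1F600]

for cp in samples:
    print("U+%04X -> %d byte(s)" % (cp, utf8_len(chr(cp))))
```

The ASCII character comes out at 1 byte and the Latin-1 character at 2, as the manual says; codepoints further up the range take 3 or 4.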

Well, you've just proven what Vim users have always suspected: Emacs
doesn't really exist.

> [1] This internal representation is based on one of the encodings
> defined by the Unicode Standard, called UTF-8, for representing any
> Unicode codepoint, but Emacs extends UTF-8 to represent the additional
> codepoints it uses for raw 8-bit bytes and characters not unified with
> Unicode.
> "

Do you know what those characters not unified with Unicode are? Is there
a list somewhere? I've read all of the pages from here to no avail:

http://www.gnu.org/software/emacs/manual/html_node/elisp/Non_002dASCII-Characters.html

-- 
Steven