Ryan Johnson wrote:
>I set a breakpoint there, since I thought it was guaranteed to lead to a
>crash if it ever ran, but it turns out that's not true. Invoking M-x
>compile triggers the breakpoint twice in a row with the following
>(valid!) 5-byte UTF-8:
>
>111110XX 10XXXXXX 10XXXXXX 10XXXXXX 10XXXXXX
>11111000 10001111 10111111 10111101 10111111
>
>The value is always the same, and corresponds to the code point
>U+3FFF7F, FWIW. The backtrace seems to involve loading a file (maybe the
>.elc contains 'compile or 'compilation-mode?), and the breakpoint does
>not recur in subsequent compilations, presumably because they don't
>re-load the file. Emacs continues normally from there, because the
>leading bits are zero and the resulting code point doesn't pass the
>0x3FFFFF limit.
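
The arithmetic in the quoted message can be checked by hand: strip the 111110 prefix from the lead byte and the 10 prefix from each continuation byte, then concatenate the remaining 2 + 6 + 6 + 6 + 6 bits. Below is a minimal sketch in Python that does only that. The names `decode_5byte`, `SEQ` and `LIMIT` are made up for illustration and are not taken from the Emacs sources; the limit is simply the 0x3FFFFF figure quoted above. Standard UTF-8 stops at 4-byte sequences, so this decodes the bit layout shown in the message rather than anything a stock UTF-8 decoder would accept.

```python
def decode_5byte(seq: bytes) -> int:
    """Decode one sequence laid out as 111110XX followed by four 10XXXXXX bytes."""
    if len(seq) != 5:
        raise ValueError("expected exactly 5 bytes")
    if seq[0] & 0b11111100 != 0b11111000:
        raise ValueError("lead byte is not of the form 111110XX")

    # The lead byte contributes its low 2 bits.
    code_point = seq[0] & 0b00000011

    # Each continuation byte contributes its low 6 bits.
    for byte in seq[1:]:
        if byte & 0b11000000 != 0b10000000:
            raise ValueError("continuation byte is not of the form 10XXXXXX")
        code_point = (code_point << 6) | (byte & 0b00111111)

    return code_point


# The byte sequence from the quoted message.
SEQ = bytes([0b11111000, 0b10001111, 0b10111111, 0b10111101, 0b10111111])

# Assumption: the limit is the 0x3FFFFF value mentioned in the message.
LIMIT = 0x3FFFFF

value = decode_5byte(SEQ)
print(hex(value))      # 0x3fff7f
print(value <= LIMIT)  # True
```

The two payload bits of the lead byte are both zero here, which is why the result stays under the limit.
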

Modern Emacs uses an extended UTF-8 as its internal representation.

http://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Representations.html

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple