>>>>> "Alan" == Alan DeKok <[EMAIL PROTECTED]> writes:
    Alan> Sam Hartman wrote: The whole composed / decomposed thing is
    Alan> a nightmare for passwords.
    >> And one the emu working group needs to deal with.

    Alan> RFC 3629 says that overlong sequences are invalid:

    Alan>   Implementations of the decoding algorithm above MUST
    Alan>   protect against decoding invalid sequences. For instance, a
    Alan>   naive implementation may decode the overlong UTF-8 sequence
    Alan>   C0 80 ...

    Alan> I would therefore follow the lead of the UTF-8 experts,
    Alan> and suggest that decomposed characters are "overlong", and
    Alan> thus invalid for the purposes of EMU. Therefore, UTF-8
    Alan> sequences must consist solely of composed characters.

This will be my last message on the subject.

There are approaches to consider in this space, but it's not that
simple. If only it were that simple.

One of the simplest problems with this approach is that not all the
combined characters that you need actually exist.

Another potential issue is that I think normalizing towards decomposed
may be easier than normalizing towards combined.

--Sam

_______________________________________________
Emu mailing list
Emu@ietf.org
https://www1.ietf.org/mailman/listinfo/emu
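
For anyone who wants to see the distinction concretely, here is a minimal sketch in Python. It is only an illustration, not a proposal for EMU. The helper name utf8_is_valid and the sample strings are my own choices; the only library calls used are the standard bytes.decode and unicodedata.normalize. It shows that an overlong sequence such as C0 80 is rejected at the byte level, that a decomposed sequence is still perfectly valid UTF-8, and that "q" followed by COMBINING DOT BELOW (U+0323) has no precomposed code point, so normalizing towards composed leaves it as two code points.

```python
import unicodedata


def utf8_is_valid(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The overlong encoding of U+0000 quoted from RFC 3629.
overlong_nul = bytes([0xC0, 0x80])

# "e" with acute accent: one precomposed code point vs. base + combining mark.
composed_e = "\u00e9"
decomposed_e = "e\u0301"

# "q" + COMBINING DOT BELOW: Unicode defines no precomposed form for this.
q_dot_below = "q\u0323"

if __name__ == "__main__":
    # Overlong sequences fail at the encoding layer.
    print(utf8_is_valid(overlong_nul))                  # False
    # Decomposed text is valid UTF-8; it is a normalization question.
    print(utf8_is_valid(decomposed_e.encode("utf-8")))  # True
    # NFC composes where a precomposed code point exists ...
    print(unicodedata.normalize("NFC", decomposed_e) == composed_e)  # True
    # ... and leaves the sequence decomposed where none exists.
    print(len(unicodedata.normalize("NFC", q_dot_below)))            # 2
```

So a rule of "composed characters only" cannot be enforced by a UTF-8 decoder, and some strings cannot satisfy it at all.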