>>>>> "Alan" == Alan DeKok <[EMAIL PROTECTED]> writes:

    Alan> Sam Hartman wrote: The whole composed / decomposed thing is
    Alan> a nightmare for passwords.
    >>  And one the emu working group needs to deal with.

    Alan>   RFC 3629 says that overlong sequences are invalid:

    Alan>    Implementations of the decoding algorithm above MUST
    Alan> protect against decoding invalid sequences.  For instance, a
    Alan> naive implementation may decode the overlong UTF-8 sequence
    Alan> C0 80 ...

    Alan>   I would therefore follow the lead of the UTF-8 experts,
    Alan> and suggest that decomposed characters are "overlong", and
    Alan> thus invalid for the purposes of EMU.  Therefore, UTF-8
    Alan> sequences must consist solely of composed characters.


This will be my last message on the subject.

There are approaches to consider in this space, but if only it were
that simple.  The most basic problem with this approach is that not
all of the composed characters that you need actually exist.  Another
potential issue is that I think normalizing towards decomposed may be
easier than normalizing towards composed.
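
As a rough illustration of the first point, here is a minimal sketch
using Python's standard unicodedata module.  The sample strings and
the is_valid_utf8 helper are my own choices for illustration, not
anything taken from a draft.  It shows a base letter plus combining
mark that has no single-code-point form, so composing normalization
(NFC) has to leave it decomposed, and it shows that such a sequence is
still well-formed UTF-8, unlike the overlong C0 80.

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if a strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "e" + COMBINING ACUTE ACCENT: a precomposed form (U+00E9) exists.
E_ACUTE_DECOMPOSED = "e\u0301"
# "q" + COMBINING DOT BELOW: Unicode defines no precomposed code point.
Q_DOT_BELOW = "q\u0323"

composed_e = unicodedata.normalize("NFC", E_ACUTE_DECOMPOSED)
composed_q = unicodedata.normalize("NFC", Q_DOT_BELOW)
decomposed_e = unicodedata.normalize("NFD", "\u00e9")

# NFC collapses the first pair to one code point but cannot compose the second.
print(len(composed_e), len(composed_q))

# The decomposed sequence is well-formed UTF-8; the overlong C0 80 is not.
print(is_valid_utf8(Q_DOT_BELOW.encode("utf-8")))
print(is_valid_utf8(b"\xc0\x80"))
```

A rule of "composed characters only" would have nothing to offer for
the second string, while NFD gives an answer for both.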

--Sam



_______________________________________________
Emu mailing list
Emu@ietf.org
https://www1.ietf.org/mailman/listinfo/emu
