Bruno Haible writes:
Simon Josefsson wrote:
> I'm calculating this IDNA2008 property
>
>   toNFKC(toCaseFold(toNFKC(cp))) != cp
>
> for all code points.
It makes no sense to consider non-character code points here. Citing again
the Unicode standard, chapter 3 [1], section 3.8:
"High-surrogate and low-surrogate c
FWIW, I came up with a better approach to handle this, and have asked
for confirmation of the interpretation on the IDNABIS list. So I think
u32_normalize is fine, as you explained.
http://www.alvestrand.no/pipermail/idna-update/2011-May/007099.html
/Simon
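
For readers following along, here is a minimal sketch of the property quoted above, restricted to Unicode scalar values so that surrogates are skipped, as suggested in the reply. It uses Python's standard unicodedata module rather than libunistring, and it is not necessarily the approach referred to in the message above. Assumptions: str.casefold() stands in for toCaseFold, unicodedata.normalize("NFKC", ...) stands in for toNFKC, and the helper names are made up for illustration. Results depend on the Unicode version shipped with the Python build.

```python
import unicodedata


def is_surrogate(cp: int) -> bool:
    """True for code points in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def changes_under_nfkc_casefold(cp: int) -> bool:
    """Evaluate toNFKC(toCaseFold(toNFKC(cp))) != cp for one code point.

    Assumption: str.casefold() approximates toCaseFold and
    unicodedata.normalize("NFKC", ...) approximates toNFKC.
    """
    original = chr(cp)
    step1 = unicodedata.normalize("NFKC", original)
    step2 = step1.casefold()
    step3 = unicodedata.normalize("NFKC", step2)
    return step3 != original


def unstable_scalar_values():
    """Yield every scalar value for which the property holds.

    Surrogates are skipped up front instead of being normalized.
    """
    for cp in range(0x110000):
        if is_surrogate(cp):
            continue
        if changes_under_nfkc_casefold(cp):
            yield cp


if __name__ == "__main__":
    for cp in (0x0041, 0x0061, 0x00DF, 0xFB01):
        print(f"U+{cp:04X}: {changes_under_nfkc_casefold(cp)}")
```

Skipping the surrogate range before normalizing means the loop never hands a non-character value to the normalizer, which is the case under discussion in this thread.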
Bruno Haible writes:
Simon Josefsson wrote:
> I'm doing some Unicode NFKC operations and noticing that u32_normalize
> fails for U+D800.
This is valid behaviour, because U+D800 is a "surrogate" code point
and therefore not a valid character code point.
See the Unicode standard, chapter 2 [1], pages 23..24:
Surrogat
I'm doing some Unicode NFKC operations and noticing that u32_normalize
fails for U+D800. Is this behaviour permitted by TR15? I thought
toNFKC should succeed for all code points.
/Simon
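
The distinction raised in this thread, between code points in general and code points that can stand for characters, can be illustrated with a short sketch. This is Python, not libunistring, and the helper names are hypothetical. It classifies a code point as a surrogate or a scalar value and shows that Python's UTF-8 encoder refuses a lone surrogate, which is comparable in spirit to the failure reported for U+D800.

```python
from typing import Optional


def is_surrogate(cp: int) -> bool:
    """True for code points in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_scalar_value(cp: int) -> bool:
    """True for code points in U+0000..U+10FFFF excluding surrogates."""
    return 0 <= cp <= 0x10FFFF and not is_surrogate(cp)


def utf8_or_none(cp: int) -> Optional[bytes]:
    """Return the UTF-8 encoding of a code point, or None if it has none.

    Python's encoder raises UnicodeEncodeError for lone surrogates.
    """
    try:
        return chr(cp).encode("utf-8")
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    for cp in (0x0041, 0xD7FF, 0xD800, 0xDFFF, 0xE000):
        print(f"U+{cp:04X} scalar={is_scalar_value(cp)} utf8={utf8_or_none(cp)}")
```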