Re: [fpc-devel] Unicode support (again)

peter green Tue, 11 Nov 2008 03:51:26 -0800

Michael Schnell wrote:

It will at best be "friendly old school behaviour which works most of the time, but which fails as soon as the strings are not completely normalised because then you can have decomposed characters and whatnot" (which in turn easily leads to security holes due to incomplete checks, hard to reproduce bugs and "write once, debug everywhere"-style behaviour).
Sorry, I don't understand. What non-normalized behavior needs to be taken into account?

Remember that an individual code point does not necessarily represent what a user would consider a character. Indeed, one character may be representable in more than one way (either as a precomposed character or as a sequence of base character and combining diacritic). And even if we ignore combining diacritics, the number of console positions a string takes is not necessarily equal to the code point count either, since many CJK characters take two console positions.
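
To make the distinction concrete, here is a minimal sketch in Python (used only because its standard library ships the Unicode character database; it is not FPC or RTL code). The helper names utf8_units, utf16_units and console_width are made up for this illustration, and console_width is a rough approximation that assumes wide and fullwidth characters take two columns and combining marks take none.

```python
import unicodedata

precomposed = "\u00e9"   # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

def utf8_units(s):
    """Number of UTF-8 code units (bytes) in s."""
    return len(s.encode("utf-8"))

def utf16_units(s):
    """Number of UTF-16 code units in s."""
    return len(s.encode("utf-16-le")) // 2

def console_width(s):
    """Rough column count (hypothetical helper, an approximation only)."""
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue  # combining marks are assumed to occupy no column
        # "W" (wide) and "F" (fullwidth) are assumed to take two columns
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

# Same user-visible character, different code point counts.
print(len(precomposed), len(decomposed))
# The two spellings compare unequal until they are normalised.
print(precomposed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == precomposed)
# Code unit counts differ again, depending on the encoding.
print(utf8_units(decomposed), utf16_units(decomposed))
# Three code points, but more than three console columns.
print(len("日本語"), console_width("日本語"))
```

None of the counts agree with each other in general, which is the point being made: which one is "the length" depends entirely on what the caller is trying to do.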

Given these facts, code point counts and indexes are not much more useful than code unit indexes and counts.
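
As an illustration of why code point indexes do not buy much, the following Python sketch truncates a string at a code point index and silently drops an accent. Both function names are hypothetical, and truncate_keep_marks is only an approximation that keeps combining marks attached to their base character; it does not implement full user-perceived character segmentation, which is the kind of thing that needs the external library discussed here.

```python
import unicodedata

def truncate_code_points(s, n):
    """Keep the first n code points, wherever that cut happens to fall."""
    return s[:n]

def truncate_keep_marks(s, n):
    """Keep the first n base characters plus any combining marks that
    follow them (hypothetical helper, not full text segmentation)."""
    out = []
    bases = 0
    for ch in s:
        if unicodedata.combining(ch) == 0:
            if bases == n:
                break  # the next base character would exceed the limit
            bases += 1
        out.append(ch)
    return "".join(out)

word = "cafe\u0301s"  # "cafés" with a decomposed é

# Cutting at code point 4 separates the accent from its base letter.
print(truncate_code_points(word, 4) == "cafe")
# Keeping the mark attached preserves what the user sees as "café".
print(truncate_keep_marks(word, 4) == "cafe\u0301")
```

The code point cut is no safer here than a code unit cut would be: both can land in the middle of what the user considers one character.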

And if you need something better than either code point count or code unit count, then you have little choice but to pull in an external library. Pulling in an external library with a relatively unstable interface is not something the compiler or RTL should be doing IMO.


