Re: uninorm/nfc - Unicode version?

Bruno Haible Sun, 09 Jan 2011 02:46:47 -0800

Hi Simon,

> There is no hurry, I'm mostly curious about what kind of
> non-trivial changes there would be.  I know that between 5.2 and 6.0
> there were some changes made that would affect an IDNA2008 implementation
> for example.
> 
> The best would be if the process to re-generate the files were
> documented, then I could generate them on the fly to test my code with a
> 5.1, 5.2 and 6.0 Unicode library, which would be useful for
> compatibility and regression testing.
> 
> If there were significant changes in any _algorithm_, as opposed to data
> tables, that would be interesting to know.  I recall the NFKC algorithm
> changed slightly between Unicode 3.2.0 and the next version but
> hopefully 5.1/5.2/6.0 doesn't see any changes like that any more.


I have now updated the Unicode-related modules to Unicode 5.2.0.
The process involves more than just regenerating data files. It also
requires updating some functions in gen-uni-tables.c to match the updated
Unicode Standard Annexes.

There was one significant change in the line breaking algorithm: in
strings like "x(y", a line break is no longer possible before the opening
parenthesis. There were also visible changes in the width determination.

Other than that, I didn't see any algorithm changes in Unicode TR#11,
TR#14, TR#15, TR#24, or TR#29.
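
For the kind of cross-version regression test you describe, one rough
approach is to run the same inputs through tables built from two Unicode
versions and report what differs. The sketch below does not use
libunistring or gen-uni-tables.c; it is only an illustration using
Python's standard unicodedata module, which ships a frozen Unicode 3.2.0
database (unicodedata.ucd_3_2_0) next to its current one. The helper name
diff_properties is hypothetical, and it covers only character properties
and normalization, not line breaking.

```python
import unicodedata

def diff_properties(chars, old=unicodedata.ucd_3_2_0, new=unicodedata):
    """Return {char: {name: (old_value, new_value)}} for values that differ.

    'old' and 'new' are assumed to expose the unicodedata lookup API;
    by default they are the Unicode 3.2.0 snapshot and the current tables.
    """
    diffs = {}
    for ch in chars:
        changed = {}
        # Per-character properties, including the East Asian width class.
        for name in ("category", "east_asian_width",
                     "combining", "decomposition"):
            a = getattr(old, name)(ch)
            b = getattr(new, name)(ch)
            if a != b:
                changed[name] = (a, b)
        # Normalization results depend on the decomposition and
        # combining-class data of the respective version.
        for form in ("NFC", "NFKC"):
            a = old.normalize(form, ch)
            b = new.normalize(form, ch)
            if a != b:
                changed[form] = (a, b)
        if changed:
            diffs[ch] = changed
    return diffs

if __name__ == "__main__":
    print("old tables:", unicodedata.ucd_3_2_0.unidata_version)
    print("new tables:", unicodedata.unidata_version)
    # U+1E9E was assigned after Unicode 3.2.0, so it should show up.
    print(diff_properties("A\u1e9e"))
```

The same pattern would apply to any library that can be built against
several versions of the data files: feed both builds a fixed corpus and
compare the outputs.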

> Btw, I'm (finally) working on an IDNA2008 implementation, and it is using
> your libunistring.

How will this work with the glibc add-on? Will it incorporate some parts
of libunistring literally, or will it load libunistring dynamically?

Bruno
