Hi Simon,

> There is no hurry, I'm mostly curious about what kind of
> non-trivial changes there would be. I know that between 5.2 and 6.0
> there were some changes made that would affect a IDNA2008 implementation
> for example.
>
> The best would be if the process to re-generate the files were
> documented, then I could generate them on the fly to test my code with a
> 5.1, 5.2 and 6.0 Unicode library, which would be useful for
> compatibility and regression testing.
>
> If there were significant changes in any _algorithm_, as opposed to data
> tables, that would be interesting to know. I recall the NFKC algorithm
> changed slightly between Unicode 3.2.0 and the next version but
> hopefully 5.1/5.2/6.0 doesn't see any changes like that any more.

I have now updated the Unicode-related modules to Unicode 5.2.0. The
process involves more than just regenerating the data files: some
functions in gen-uni-tables.c also have to be updated to match the
revised Unicode Standard Annexes.

There was one significant change in the line breaking algorithm: in
strings like "x(y", a line break is no longer possible before the
opening parenthesis. There were also visible changes in the width
determination. Other than that, I didn't see any algorithm changes in
Unicode TR#11, TR#14, TR#15, TR#24, TR#29.

> Btw, I'm (finally) working on a IDNA2008 implementation, and it is using
> your libunistring.

How will this work with the glibc add-on? Will it incorporate some parts
of libunistring literally, or will it load libunistring dynamically?

Bruno
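
To make the line breaking change concrete, here is a minimal sketch of the
behaviour difference for "x(y". It is a toy model only: the three classes,
the names toy_classify and toy_break_before, and the honor_5_2_rule flag are
all hypothetical and are not taken from gen-uni-tables.c or libunistring. It
assumes ASCII input and ignores every other TR#14 rule.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified line break classes (illustration only). */
enum toy_class { TOY_ALNUM, TOY_OPEN, TOY_OTHER };

/* Classify one ASCII character into the toy classes above. */
static enum toy_class toy_classify(char c)
{
    if (isalnum((unsigned char) c))
        return TOY_ALNUM;
    if (c == '(' || c == '[' || c == '{')
        return TOY_OPEN;
    return TOY_OTHER;
}

/* Return true if a line break is possible before s[i].
   Precondition: 1 <= i < strlen(s).
   honor_5_2_rule selects the behaviour described for Unicode 5.2.0:
   no break between a letter or digit and an opening parenthesis. */
static bool toy_break_before(const char *s, size_t i, bool honor_5_2_rule)
{
    enum toy_class prev = toy_classify(s[i - 1]);
    enum toy_class cur = toy_classify(s[i]);

    if (prev == TOY_ALNUM && cur == TOY_ALNUM)
        return false;           /* never break inside a word */
    if (prev == TOY_OPEN)
        return false;           /* never break right after an opening bracket */
    if (honor_5_2_rule && prev == TOY_ALNUM && cur == TOY_OPEN)
        return false;           /* the change discussed above */
    return true;                /* toy default: break allowed */
}

/* Small self-check covering only the "x(y" case from this message. */
static void toy_self_check(void)
{
    assert(toy_break_before("x(y", 1, false));   /* old: break before '(' */
    assert(!toy_break_before("x(y", 1, true));   /* 5.2.0: no break there */
    assert(!toy_break_before("x(y", 2, true));   /* no break after '(' */
}
```

A regression test against several Unicode versions could compare such
per-position results between builds; with the real library the positions
would come from its line breaking functions rather than from this toy.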