Bruno Haible <br...@clisp.org> writes:

> Hi Simon,
>
>> There is no hurry, I'm mostly curious about what kind of
>> non-trivial changes there would be. I know that between 5.2 and 6.0
>> there were some changes made that would affect a IDNA2008 implementation
>> for example.
>>
>> The best would be if the process to re-generate the files were
>> documented, then I could generate them on the fly to test my code with a
>> 5.1, 5.2 and 6.0 Unicode library, which would be useful for
>> compatibility and regression testing.
>>
>> If there were significant changes in any _algorithm_, as opposed to data
>> tables, that would be interesting to know. I recall the NFKC algorithm
>> changed slightly between Unicode 3.2.0 and the next version but
>> hopefully 5.1/5.2/6.0 doesn't see any changes like that any more.
>
> I have now updated the Unicode related modules to Unicode 5.2.0.
> The process involves more than just regenerating data files. It also
> requires to update some functions in gen-uni-tables.c to match the updated
> Unicode Standard Annexes.

Thank you very much! Unicode 5.2.0 is better than 6.0.0 for me, since
IDNA2008 references 5.2.0 normatively in some aspects. Once I have
established a good set of self tests, I will run them against
libunistring for both 5.0.0 and 5.2.0 to see if I can find any string
that behaves differently.

>> Btw, I'm (finally) working on a IDNA2008 implementation, and it is using
>> your libunistring.
>
> How will this work with the glibc add-on? Will it incorporate some parts
> of libunistring literally, or will it load libunistring dynamically?

I have no idea yet. Right now, libidna doesn't even link to
libunistring dynamically because I want to make sure I get the "right"
libunistring implementation.

Given the complexities in IDNA2008, I am wondering whether it might
make more sense to let glibc ask a system daemon to do the string
conversion rather than to do everything in glibc. There is still a lot
of work being done on various pre- and post-IDNA2008 mappings because
IDNA2008 by itself is neither backwards/forwards compatible nor safe to
use. This may be something you want to configure on a per-system basis.

/Simon