I've added a -n flag to osis2mod that will normalize UTF-8 to NFC, which we've agreed on as the standard for UTF-8 modules.
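For anyone who wants to see what NFC normalization does to the bytes, here is a minimal sketch. It uses Python's standard unicodedata module, not Sword's UTF8NFC filter or osis2mod, and the helper names to_nfc and is_nfc are made up for illustration. It only shows the general behaviour: decomposed input gets composed, and text that is already NFC passes through unchanged.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text normalized to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def is_nfc(text: str) -> bool:
    """True if normalizing to NFC would leave the text unchanged."""
    return to_nfc(text) == text


# 'e' followed by U+0301 COMBINING ACUTE ACCENT (decomposed form)
decomposed = "e\u0301"
# U+00E9 LATIN SMALL LETTER E WITH ACUTE (composed form)
composed = "\u00e9"

# The two strings render the same but are different byte sequences in UTF-8.
assert decomposed.encode("utf-8") == b"e\xcc\x81"
assert composed.encode("utf-8") == b"\xc3\xa9"

# NFC composes the decomposed sequence into the single code point.
assert to_nfc(decomposed) == composed

# Text that is already NFC is left unchanged.
assert to_nfc(composed) == composed
assert is_nfc(composed) and not is_nfc(decomposed)
```

A check like is_nfc could be run over the text of an OSIS file before import to tell whether the -n flag should make any difference for that file.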
I used Sword's UTF8NFC filter to do the work, but found that it was buggy, producing trailing garbage on some verses. I have created a patch for both osis2mod and UTF8NFC at www.crosswire.org/~dmsmith/nfcPatch.txt and would greatly appreciate some more testing of it.

My test was fairly trivial. I took an OSIS file with limited UTF-8 content that was already NFC, ran it through osis2mod with and without the -n flag, and then compared the two outputs. Before I fixed UTF8NFC there were differences. After fixing UTF8NFC, there were none. All that this shows is that the patch does not corrupt a UTF-8 file that is already good NFC. It says nothing yet about input that is not NFC.

Many thanks in advance.

DM
