While developing the MapM, we ran into the normalization problem. It stems from a Wikimedia bug, or perhaps a bug in the Unicode normalization algorithm, described here:
https://bugzilla.wikimedia.org/show_bug.cgi?id=2399
This causes the Hebrew text of the original MapM on Wikimedia to have its characters reordered from the standard Hebrew ordering. In the OSIS text of the MapM, I fixed the ordering using regular expressions, so the text now conforms to the Hebrew order: consonant, shin/sin dot, dagesh, etc.
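
For readers who want to see the effect, here is a minimal Python sketch of this kind of reordering. It is not the actual set of regular expressions used for the MapM OSIS text. The names CLUSTER, mark_rank and reorder_marks are hypothetical, and because only the start of the order is stated above (consonant, shin/sin dot, dagesh), all other marks are simply left in the order in which they arrive.

```python
import re
import unicodedata

SHIN_DOT = "\u05C1"
SIN_DOT = "\u05C2"
DAGESH = "\u05BC"

# One Hebrew consonant followed by its run of combining points and accents.
# Maqaf, paseq, sof pasuq and nun hafukha are punctuation and are excluded.
CLUSTER = re.compile(
    r"([\u05D0-\u05EA])([\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]+)"
)


def mark_rank(mark):
    """Assumed priority: shin/sin dot first, then dagesh, then everything else."""
    if mark in (SHIN_DOT, SIN_DOT):
        return 0
    if mark == DAGESH:
        return 1
    return 2


def reorder_marks(text):
    """Move shin/sin dot and dagesh directly after their consonant."""
    def fix(match):
        # sorted() is stable, so marks of equal rank keep their incoming order.
        marks = sorted(match.group(2), key=mark_rank)
        return match.group(1) + "".join(marks)
    return CLUSTER.sub(fix, text)


# Shin + shin dot + dagesh + patah, in the order described above.
original = "\u05E9\u05C1\u05BC\u05B7"
# NFC sorts the marks by canonical combining class: patah, dagesh, shin dot.
normalized = unicodedata.normalize("NFC", original)
restored = reorder_marks(normalized)
```

In this example, normalized differs from original only in the order of the three marks, and restored is identical to original.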

David

On 6/25/2014 8:53 AM, David Haslam wrote:
My observations arise in connection with Hebrew Unicode text.

I do know why NFC is default, and why it's recommended.

The Hebrew MapM module is not NFC normalized, so there must have been a
genuine reason why the -N option was used during its build. Another Hebrew
module (from IBT) is also not normalized.

Likewise, an earlier version of the Hebrew WLC module was rebuilt without
NFC, albeit the current release is normalized. Refer to the file wlc.conf
for the history.
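
Whether a given text is NFC normalized can be checked independently of osis2mod. The following is a minimal Python sketch that only assumes the text is available as a list of decoded lines; the function name non_nfc_lines is hypothetical.

```python
import unicodedata


def non_nfc_lines(lines):
    """Return the 1-based numbers of lines that NFC normalization would change."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if unicodedata.normalize("NFC", line) != line
    ]


sample = [
    "abc",                 # plain ASCII, already NFC
    "\u05E9\u05C1\u05BC",  # shin, shin dot, dagesh: NFC would reorder the marks
    "\u05E9\u05BC\u05C1",  # shin, dagesh, shin dot: already in canonical order
]
changed = non_nfc_lines(sample)
```

An empty result means the text is already in NFC; otherwise the listed lines are the ones normalization would alter.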

This suggests that the -N option can be made to work, but perhaps it has
only ever been tested under Linux?  As a Windows user, I am curious as to
why I could not get it to work at all.

Though I can't go into any details, my OSIS XML source text is already
UTF-8, and is valid to the OSIS schema.

I am still curious as to why there was a historic reversion of normalization
for the WLC module.
I asked Chris, but he never responded; I guess he's too busy this
year.

Best regards,

David

--
View this message in context: 
http://sword-dev.350566.n4.nabble.com/Using-the-N-option-in-osis2mod-tp4653983p4654013.html
Sent from the SWORD Dev mailing list archive at Nabble.com.

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page

