DM Smith wrote:
> Java has ICU4J. So I imagine it is in there. Auto-detection is fine,
> but I would think that it is expensive to do in the engine when
> rendering text. It seems that it would be best to have properly tagged
> xml. Using auto-detection to help with re-tagging would be better, IMHO.
Yes, very expensive I imagine.

One thing that would be possible would be to throw in a tag to indicate the borders between scripts. For example:

<milestone script="Latn"/>LATIN LATIN LATIN <milestone script="Hebr"/>HEBREW HEBREW <milestone script="Latn"/>LATIN LATIN

We could write a little utility to process modules and spit out a text with script-identifying milestones.

> I have some questions:
> 1) Would it be of any value to SWORD to do something with this?
> I am imagining a conf change Script=, where the content would be a
> comma separated list of scripts, with the most dominant being first.
> E.g. A Strong's Greek and Hebrew dictionary with actual Greek and
> Hebrew would have:
> Script=Latin,Grek,Hebr

I'm not sure how that would be useful. Can you think of how it would be used? Maybe use the complete list from all modules to determine which scripts to present to the user for assignment (similar to what we do with the GlobalOptionFilters).

> 2) How would a SWORD application use script to allow the user to
> assign a font to the script?
> This is a usability question of how a user would know which script
> goes with which language with which font?
> In looking at the list, it was not obvious that English, French,
> Norwegian, .... use Latin.

A good question. I guess we could just give examples. There's usually no difference between the name of the language and the name of its script. Latin is the big exception. I think most Russian speakers know they're using Cyrillic. Greek is in Greek, Hebrew is in Hebrew, Coptic is in Coptic, Arabic is in Arabic, Korean is in Korean, etc. Chinese, Traditional and Simplified, are pretty well known by their users. Japanese is complex because it uses 3 scripts, but Japanese speakers are aware of them all by name.

Another thing to bear in mind is that virtually every font has basic Latin support. A font intended for displaying Burmese will still have Latin characters. On the other hand, really good Latin coverage is hard to find.
Even fonts intended for Latin use may not have sufficient coverage to display something like Maltese.

--Chris
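
P.S. To make the milestone idea a bit more concrete, here is a rough sketch of what such a utility might look like. It is only an illustration, not anything that exists in SWORD: the function names and the prefix table are made up for this example, and it guesses the script from the first word of each character's Unicode name (Python's standard library has no direct script property), which is a crude heuristic. It also assumes plain text input; real module markup would have to be parsed first so that tag names are not mistaken for Latin text.

```python
import unicodedata

# Hypothetical table: first word of a Unicode character name mapped to an
# ISO 15924 script code. Scripts missing from the table are left untagged.
NAME_PREFIX_TO_SCRIPT = {
    "LATIN": "Latn",
    "GREEK": "Grek",
    "HEBREW": "Hebr",
    "CYRILLIC": "Cyrl",
    "ARABIC": "Arab",
    "COPTIC": "Copt",
}


def script_of(ch):
    """Return a script code for ch, or None if it should stay in the current run."""
    if not ch.isalpha():
        # Spaces, digits, punctuation and combining marks are treated as neutral.
        return None
    prefix = unicodedata.name(ch, "").split(" ", 1)[0]
    return NAME_PREFIX_TO_SCRIPT.get(prefix)


def add_script_milestones(text):
    """Insert a <milestone script="..."/> tag wherever the script changes."""
    out = []
    current = None
    for ch in text:
        script = script_of(ch)
        if script is not None and script != current:
            out.append('<milestone script="%s"/>' % script)
            current = script
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(add_script_milestones("abc \u05e9\u05dc\u05d5\u05dd def"))
```

Neutral characters simply stay with whatever run came before them, so a space between a Latin word and a Hebrew word ends up on the Latin side of the border. A real tool would more likely use ICU's script property than this name-based guess.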