SWORD too. I don’t yet see a value in the suggested conf entry.
-- DM Smith
From my phone. Brief. Weird autocorrections.

> On Jan 7, 2018, at 4:03 AM, David Haslam <dfh...@protonmail.com> wrote:
>
> You mean in the JSword API?
>
> If so, that's a start. Thanks, DM. :)
>
> Does that mean you now support the proposed new config key being accepted and
> documented?
>
> Best regards,
>
> David
>
> Sent from ProtonMail Mobile
>
>
>> On Sat, Jan 6, 2018 at 23:43, DM Smith <dmsm...@crosswire.org> wrote:
>> I added -N. To make search work.
>>
>> -- DM Smith
>> From my phone. Brief. Weird autocorrections.
>>
>> On Jan 6, 2018, at 4:41 PM, David Haslam <dfh...@protonmail.com> wrote:
>>
>>> Thanks DM.
>>>
>>> Interesting observations.
>>>
>>> It prompts the question whether either engine includes the capability to
>>> normalize the search index (assuming that it does normalize the search key).
>>> And that it does this by default?
>>> Or does indexing assume that all modules were made without using the -N
>>> option and are therefore already in NFC?
>>> Yet it also remains the case that some front-ends also provide for
>>> non-indexed search options.
>>>
>>> Moreover, it raises questions as to how the front-end actually displays the
>>> set of search results when all or part of the underlying module is not NFC.
>>>
>>> It must be the case that the developers of osis2mod had a valid reason to
>>> provide the -N option.
>>> Are those involved back then still with CrossWire?
>>>
>>> Best regards,
>>>
>>> David
>>>
>>>
>>> Sent from ProtonMail Mobile
>>>
>>>
>>>> On Sat, Jan 6, 2018 at 21:20, DM Smith <dmsm...@crosswire.org> wrote:
>>>> The purpose of normalization was for the sake of search. Only when the
>>>> search index and the search request are normalized to the same form can a
>>>> result be found.
>>>>
>>>> It doesn’t matter if the normalized form is not readable. If SWORD (or
>>>> JSword) normalizes both the same, then it doesn’t matter what Unicode
>>>> Normalization or lack of it is used for displaying the text.
>>>>
>>>> Assuming that SWORD (or JSword) handles search properly, the only
>>>> advantage of canonical over decomposed in the module itself is space.
>>>>
>>>> In Him,
>>>> DM
>>>>
>>>>> On Jan 6, 2018, at 2:26 PM, David Haslam <dfh...@protonmail.com> wrote:
>>>>>
>>>>> Good question, Tom.
>>>>>
>>>>> Assuming that the Latin script part of the source text actually required
>>>>> normalization to NFC,
>>>>> and that at least some of the Biblical Hebrew should not be converted to
>>>>> NFC,
>>>>> you'd build the module using the -N switch of osis2mod, after first
>>>>> applying a script
>>>>> to the source text to ensure that both the requirements were implemented.
>>>>>
>>>>> It would be a very simple task for a bespoke TextPipe filter with a
>>>>> restrict filter
>>>>> designed to limit the Convert to NFC subfilter to the text that was not
>>>>> Hebrew.
>>>>>
>>>>> Ignoring alphabetical presentation forms, all the Hebrew characters are
>>>>> in one Unicode block.
>>>>> A PCRE to exclude the Hebrew would be very simple.
>>>>> I could almost do it in my sleep after 17 years using TextPipe.
>>>>> No doubt other programmers could do likewise with Perl or Python, etc.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> David
>>>>>
>>>>> Sent from ProtonMail Mobile
>>>>>
>>>>>
>>>>>> On Sat, Jan 6, 2018 at 19:14, Tom Sullivan <i...@beforgiven.info> wrote:
>>>>>> Y'all:
>>>>>>
>>>>>> For text, such as in a commentary, which includes both Hebrew and
>>>>>> English (or another modern Latin script using language), what do you put
>>>>>> for the normalization?
>>>>>>
>>>>>> Tom
>>>>>>
>>>>>> Tom Sullivan
>>>>>> i...@beforgiven.info
>>>>>> FAX: 815-301-2835
>>>>>> ---------------------
>>>>>> Great News! God created you, owns you and gave you commands to obey.
>>>>>> You have disobeyed God - as your conscience very well attests to you.
>>>>>> God's holiness and justice compel Him to punish you in Hell.
>>>>>> Jesus Christ became Man, was crucified, buried and rose from the dead
>>>>>> as a substitute for all who trust in Him, redeeming them from Hell. If
>>>>>> you repent (turn from your sin) and believe (trust) in Jesus Christ, you
>>>>>> will go to Heaven. Otherwise you will go to Hell. Warning! Good works
>>>>>> are a result, not cause, of saving trust. More info is at
>>>>>> www.esig.beforgiven.info
>>>>>> Do you believe this? Copy this signature into your email program and use
>>>>>> the Internet to spread the Great News every time you email.
>>>>>>
>>>>>> On 01/06/2018 12:32 PM, David Haslam wrote:
>>>>>>> Hi Greg,
>>>>>>>
>>>>>>> One area where it might turn out to be useful is for the search features
>>>>>>> of front-end apps.
>>>>>>> It could be important to know that the underlying module text is _not_
>>>>>>> *NFC*.
>>>>>>>
>>>>>>> That's not to lay down a requirement as to how search features should be
>>>>>>> designed,
>>>>>>> but at least to provide the information in case it does matter for some
>>>>>>> types of search option.
>>>>>>>
>>>>>>> Like other things in .conf files, a key can also be _educational_.
>>>>>>> It may prompt developers and users to ask, /*Why did they do this?*/
>>>>>>>
>>>>>>> cf. It was _almost by accident_ that in 2014, I first came across this
>>>>>>> aspect of using Unicode for Biblical Hebrew.
>>>>>>> /It applies only to texts with _both_ vowel accents and cantillation./
>>>>>>>
>>>>>>> Even though it's mentioned in our developers' wiki, it's all too easily
>>>>>>> missed by other CrossWire volunteers.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> Sent with ProtonMail Secure Email.
>>>>>>>
>>>>>>>> -------- Original Message --------
>>>>>>>> Subject: Re: [sword-devel] Module .conf files, Unicode Normalization
>>>>>>>> Local Time: 6 January 2018 5:19 PM
>>>>>>>> UTC Time: 6 January 2018 17:19
>>>>>>>> From: greg.helli...@gmail.com
>>>>>>>> To: David Haslam, SWORD Developers' Collaboration Forum
>>>>>>>>
>>>>>>>> Why would the front end or engine need to know this information?
>>>>>>>> Would it help the front end developers or users to know it? What do we gain
>>>>>>>> by adding this? (I'm not implying it wouldn't be beneficial. But the
>>>>>>>> only thing I know about Unicode is how the different UTF encodings
>>>>>>>> work, so I have no idea what use this information could be. I also
>>>>>>>> think changes to formats and information standards should be
>>>>>>>> conservative instead of liberal)
>>>>>>>>
>>>>>>>> --Greg
>>>>>>>>
>>>>>>>> On Jan 6, 2018 11:01, "David Haslam" wrote:
>>>>>>>>
>>>>>>>> Dear all,
>>>>>>>>
>>>>>>>> We've known for quite a few years that there are aspects of
>>>>>>>> *Biblical Hebrew* that mean we should _avoid_ converting the
>>>>>>>> Unicode source text to *NFC* when we build a module.
>>>>>>>>
>>>>>>>> This prompts me to suggest that we ought to define a new *key* for
>>>>>>>> .conf files.
>>>>>>>>
>>>>>>>> *Normalization=NFC* (this would be the default, and may be
>>>>>>>> _omitted_ for the vast majority of modules)
>>>>>>>> *Normalization=Custom* (we should include this in certain Biblical
>>>>>>>> Hebrew modules)
>>>>>>>>
>>>>>>>> This would make it clear to front-end developers and users alike
>>>>>>>> that the source text was _not_ converted to NFC during module build.
>>>>>>>> i.e. *osis2mod* was used intentionally with the *-N* switch, in
>>>>>>>> _accordance with the requirements of the source text provider_.
>>>>>>>>
>>>>>>>> The Unicode source text may already be encoded in *UTF-8*; this
>>>>>>>> memo is /only/ about normalization.
>>>>>>>>
>>>>>>>> In the rare eventuality that there could arise a requirement for
>>>>>>>> any of the other three normalization forms (*NFD*, *NFKC*, *NFKD*)
>>>>>>>> defined by the Unicode Consortium,
>>>>>>>> these would also be permitted values for the conf file key.
>>>>>>>>
>>>>>>>> A further benefit arises when a module needs to be updated.
>>>>>>>> If the modules team sees that the .conf file includes the line
>>>>>>>> *Normalization=Custom*
>>>>>>>> they would be forewarned against converting to NFC through
>>>>>>>> /inadvertently/ omitting the *-N* switch during module build.
>>>>>>>>
>>>>>>>> _Aside_: Another language with a need for non-standard
>>>>>>>> normalization is *Tibetan*. We don't yet have a module in that script.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> Sent with ProtonMail Secure Email.
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> sword-devel mailing list: sword-devel@crosswire.org
>>>>>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>>>>> Instructions to unsubscribe/change your settings at above page
>>>>>>>
>>>>>>> ______________________________________________________________________
>>>>>>> This email has been scanned by the Symantec Email Security.cloud service.
>>>>>>> For more information please visit http://www.symanteccloud.com
>>>>>>> ______________________________________________________________________
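
For anyone following the thread, here is a minimal Python sketch of two ideas discussed above: converting only the non-Hebrew text to NFC (what David describes doing with a TextPipe restrict filter before building with the -N switch), and normalizing both the stored text and the search request to the same form before comparing (DM's point about search). It uses only the Python standard library. The names nfc_except_hebrew and normalized_contains are hypothetical, made up for this illustration, and are not part of SWORD, JSword or osis2mod. Treating U+0590..U+05FF as "Hebrew" is an assumption that ignores the alphabetic presentation forms, as in David's message.

```python
import re
import unicodedata

# Assumption: "Hebrew" means the main Unicode block U+0590..U+05FF only;
# alphabetic presentation forms (U+FB1D..U+FB4F) are deliberately ignored.
HEBREW_RUN = re.compile(r"[\u0590-\u05FF]+")


def nfc_except_hebrew(text: str) -> str:
    """Convert everything outside the Hebrew block to NFC.

    Runs of Hebrew characters are copied through unchanged, so the order of
    their combining marks (vowel points, cantillation) is preserved.
    """
    pieces = []
    pos = 0
    for match in HEBREW_RUN.finditer(text):
        # Normalize the non-Hebrew text before this Hebrew run.
        pieces.append(unicodedata.normalize("NFC", text[pos:match.start()]))
        # Keep the Hebrew run exactly as supplied.
        pieces.append(match.group())
        pos = match.end()
    # Normalize whatever follows the last Hebrew run.
    pieces.append(unicodedata.normalize("NFC", text[pos:]))
    return "".join(pieces)


def normalized_contains(stored_text: str, request: str, form: str = "NFC") -> bool:
    """Substring search after normalizing both sides to the same form.

    The normalized strings are used only for comparison; the stored text
    itself is not modified.
    """
    return unicodedata.normalize(form, request) in unicodedata.normalize(form, stored_text)


if __name__ == "__main__":
    latin = "cafe\u0301"            # "e" followed by a combining acute accent
    hebrew = "\u05D1\u05BC\u05B7"   # bet, dagesh, patah (not in NFC order)
    mixed = latin + " " + hebrew

    prepared = nfc_except_hebrew(mixed)
    print(prepared == "caf\u00e9 " + hebrew)

    # The request has the marks in NFC order; a raw substring test would miss it.
    request = "\u05D1\u05B7\u05BC"
    print(request in prepared)
    print(normalized_contains(prepared, request))
```

Normalizing each non-Hebrew segment separately is a simplification: a combining mark that immediately follows a Hebrew run is treated on its own rather than together with the preceding character, which seemed acceptable for a sketch.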