You mean in the JSword API?
If so, that's a start. Thanks, DM. :)
Does that mean you now support the proposed new config key being accepted and
documented?
Best regards,
David
Sent from ProtonMail Mobile
On Sat, Jan 6, 2018 at 23:43, DM Smith <dmsm...@crosswire.org> wrote:
> I added -N to make search work.
>
> — DM Smith
> From my phone. Brief. Weird autocorrections.
>
> On Jan 6, 2018, at 4:41 PM, David Haslam <dfh...@protonmail.com> wrote:
>
>> Thanks DM.
>>
>> Interesting observations.
>>
>> It prompts the question whether either engine includes the capability to
>> normalize the search index (assuming that it does normalize the search key).
>> And, if so, does it do this by default?
>> Or does indexing assume that all modules were made without using the -N
>> option and are therefore already in NFC?
>> Yet it remains the case that some front-ends also provide non-indexed
>> search options.
>>
>> Moreover, it raises questions as to how the front-end actually displays the
>> set of search results when all or part of the underlying module is not NFC.
>>
>> It must be the case that the developers of osis2mod had a valid reason to
>> provide the -N option.
>> Are those involved back then still with CrossWire?
>>
>> Best regards,
>>
>> David
>>
>> Sent from ProtonMail Mobile
>>
>> On Sat, Jan 6, 2018 at 21:20, DM Smith <dmsm...@crosswire.org> wrote:
>>
>>> The purpose of normalization was for the sake of search. Only when the
>>> search index and the search request are normalized to the same form can a
>>> result be found.
>>>
>>> It doesn’t matter if the normalized form is not readable. If SWORD (or
>>> JSword) normalizes both the same, then it doesn’t matter what Unicode
>>> Normalization or lack of it is used for displaying the text.
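
A minimal sketch of the point above, using Python's standard unicodedata module as a stand-in for whatever normalizer SWORD or JSword actually applies. The toy index, the function names, and the sample text are hypothetical; the only thing illustrated is that a lookup succeeds once the index and the query share a normalization form, whichever form that is.

```python
import unicodedata


def normalize(text, form):
    """Return text in the given Unicode normalization form; None leaves it as-is."""
    return text if form is None else unicodedata.normalize(form, text)


def build_index(verses, form):
    """Map each whitespace-separated word to the set of references containing it."""
    index = {}
    for ref, text in verses.items():
        for word in normalize(text, form).split():
            index.setdefault(word, set()).add(ref)
    return index


def search(index, query, form):
    """Look up a single word, normalizing the query the same way as the index."""
    return index.get(normalize(query, form), set())


# Hypothetical module text stored decomposed: "e" followed by U+0301.
verses = {"Ref 1": "cafe\u0301 ole\u0301"}
# The user types the precomposed character U+00E9.
query = "caf\u00e9"

hits_raw = search(build_index(verses, None), query, None)     # no normalization
hits_nfc = search(build_index(verses, "NFC"), query, "NFC")   # both sides NFC
hits_nfd = search(build_index(verses, "NFD"), query, "NFD")   # both sides NFD
print(hits_raw, hits_nfc, hits_nfd)
```

Without normalization the decomposed text and the precomposed query are different code point sequences, so the lookup misses; with either form applied to both sides, it hits.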
>>>
>>> Assuming that SWORD (or JSword) handles search properly, the only advantage
>>> of composed (NFC) over decomposed (NFD) in the module itself is space.
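
The space difference can be checked with a short sketch. It assumes UTF-8 storage and uses a made-up sample string with four accented letters; real module text will vary with how many combining sequences it contains.

```python
import unicodedata


def utf8_size(text, form):
    """Number of bytes the text occupies in UTF-8 after normalization."""
    return len(unicodedata.normalize(form, text).encode("utf-8"))


# Hypothetical sample containing four accented Latin letters.
sample = "\u00c9ph\u00e9siens \u00e9t\u00e9"

nfc_bytes = utf8_size(sample, "NFC")
nfd_bytes = utf8_size(sample, "NFD")
print(nfc_bytes, nfd_bytes)
```

Each precomposed letter here takes two bytes in UTF-8, while its decomposed equivalent takes three (one for the base letter, two for the combining accent), so the decomposed form is larger by one byte per accented letter.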
>>>
>>> In Him,
>>> DM
>>>
>>>> On Jan 6, 2018, at 2:26 PM, David Haslam <dfh...@protonmail.com> wrote:
>>>>
>>>> Good question, Tom.
>>>>
>>>> Assuming that the Latin script part of the source text actually required
>>>> normalization to NFC,
>>>> and that at least some of the Biblical Hebrew should not be converted to
>>>> NFC,
>>>> you'd build the module using the -N switch of osis2mod, after first
>>>> applying a script
>>>> to the source text to ensure that both the requirements were implemented.
>>>>
>>>> It would be a very simple task for a bespoke TextPipe filter with a
>>>> restrict filter
>>>> designed to limit the Convert to NFC subfilter to the text that was not
>>>> Hebrew.
>>>>
>>>> Ignoring the Alphabetic Presentation Forms block, all the Hebrew
>>>> characters are in one Unicode block.
>>>> A PCRE to exclude the Hebrew would be very simple.
>>>> I could almost do it in my sleep after 17 years using TextPipe.
>>>> No doubt other programmers could do likewise with Perl or Python, etc.
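
A Python sketch of that idea, assuming the Hebrew block is U+0590 to U+05FF and, as stated above, ignoring the Hebrew presentation forms. The function name is hypothetical, and this is not the TextPipe filter itself: it converts every run of non-Hebrew characters to NFC and passes Hebrew runs through untouched.

```python
import re
import unicodedata

# Runs of characters outside the Hebrew block (U+0590..U+05FF).
NON_HEBREW_RUN = re.compile(r"[^\u0590-\u05FF]+")


def nfc_except_hebrew(text):
    """Normalize non-Hebrew runs to NFC; leave Hebrew runs exactly as given."""
    return NON_HEBREW_RUN.sub(
        lambda m: unicodedata.normalize("NFC", m.group(0)), text
    )


# Latin "e" + combining acute, then bet + dagesh + sheva in source order.
mixed = "cafe\u0301 \u05D1\u05BC\u05B0"

selective = nfc_except_hebrew(mixed)
blanket = unicodedata.normalize("NFC", mixed)
print(selective == blanket)
```

Blanket NFC reorders the two Hebrew marks by canonical combining class (sheva before dagesh), whereas the selective version composes only the Latin accent and keeps the Hebrew mark order supplied by the source. The output would then be built with the -N switch of osis2mod so that it is not normalized again.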
>>>>
>>>> Best regards,
>>>>
>>>> David
>>>>
>>>> Sent from ProtonMail Mobile
>>>>
>>>> On Sat, Jan 6, 2018 at 19:14, Tom Sullivan <i...@beforgiven.info> wrote:
>>>>
>>>>> Y'all:
>>>>>
>>>>> For text, such as in a commentary, which includes both Hebrew and
>>>>> English (or another modern Latin script using language), what do you put
>>>>> for the normalization?
>>>>>
>>>>> Tom
>>>>>
>>>>> Tom Sullivan
>>>>> i...@beforgiven.info
>>>>> FAX: 815-301-2835
>>>>>
>>>>> On 01/06/2018 12:32 PM, David Haslam wrote:
>>>>> > Hi Greg,
>>>>> >
>>>>> > One area where it might turn out to be useful is for the search
>>>>> > features of front-end apps.
>>>>> > It could be important to know that the underlying module text is
>>>>> > _not_ *NFC*.
>>>>> >
>>>>> > That's not to lay down a requirement as to how search features should
>>>>> > be designed,
>>>>> > but at least to provide the information in case it does matter for
>>>>> > some types of search option.
>>>>> >
>>>>> > Like other things in .conf files, a key can also be _educational_.
>>>>> > It may prompt developers and users to ask, /*Why did they do this?*/
>>>>> >
>>>>> > cf. It was _almost by accident_ that in 2014, I first came across
>>>>> > this aspect of using Unicode for Biblical Hebrew.
>>>>> > /It applies only to texts with _both_ vowel accents and cantillation./
>>>>> >
>>>>> > Even though it's mentioned in our developers' wiki, it's all too
>>>>> > easily missed by other CrossWire volunteers.
>>>>> >
>>>>> > Best regards,
>>>>> >
>>>>> > David
>>>>> >
>>>>> > Sent with ProtonMail Secure Email.
>>>>> >
>>>>> >> -------- Original Message --------
>>>>> >> Subject: Re: [sword-devel] Module .conf files, Unicode Normalization
>>>>> >> Local Time: 6 January 2018 5:19 PM
>>>>> >> UTC Time: 6 January 2018 17:19
>>>>> >> From: greg.helli...@gmail.com
>>>>> >> To: David Haslam, SWORD Developers' Collaboration Forum
>>>>> >>
>>>>> >> Why would the front end or engine need to know this information?
>>>>> >> Would it help the front end developers or users to know it? What do
>>>>> >> we gain by adding this? (I'm not implying it wouldn't be beneficial.
>>>>> >> But the only thing I know about Unicode is how the different UTF
>>>>> >> encodings work, so I have no idea what use this information could
>>>>> >> be. I also think changes to formats and information standards should
>>>>> >> be conservative instead of liberal)
>>>>> >>
>>>>> >> --Greg
>>>>> >>
>>>>> >> On Jan 6, 2018 11:01, "David Haslam" wrote:
>>>>> >>
>>>>> >> Dear all,
>>>>> >>
>>>>> >> We've known for quite a few years that there are aspects of
>>>>> >> *Biblical Hebrew* that mean we should _avoid_ converting the
>>>>> >> Unicode source text to *NFC* when we build a module.
>>>>> >>
>>>>> >> This prompts me to suggest that we ought to define a new *key* for
>>>>> >> .conf files.
>>>>> >>
>>>>> >> *Normalization=NFC* (this would be the default, and may be
>>>>> >> _omitted_ for the vast majority of modules)
>>>>> >> *Normalization=Custom* (we should include this in certain Biblical
>>>>> >> Hebrew modules)
>>>>> >>
>>>>> >> This would make it clear to front-end developers and users alike
>>>>> >> that the source text was _not_ converted to NFC during module build.
>>>>> >> i.e. *osis2mod* was used intentionally with the *-N* switch, in
>>>>> >> _accordance with the requirements of the source text provider_.
>>>>> >>
>>>>> >> The Unicode source text may already be encoded in *UTF-8*; this
>>>>> >> memo is /only/ about normalization.
>>>>> >>
>>>>> >> In the rare eventuality that there could arise a requirement for
>>>>> >> any of the other three normalization forms (*NFD*, *NFKC*, *NFKD*)
>>>>> >> defined by the Unicode Consortium,
>>>>> >> these would also be permitted values for the conf file key.
>>>>> >>
>>>>> >> A further benefit arises when a module needs to be updated.
>>>>> >> If the modules team sees that the .conf file includes the line
>>>>> >> *Normalization=Custom*
>>>>> >> they would be forewarned against converting to NFC through
>>>>> >> /inadvertently/ omitting the *-N* switch during module build.
>>>>> >>
>>>>> >> _Aside_: Another language with a need for non-standard
>>>>> >> normalization is *Tibetan*. We don't yet have a module in that
>>>>> >> script.
>>>>> >>
>>>>> >> Best regards,
>>>>> >>
>>>>> >> David
>>>>> >>
>>>>> >> Sent with ProtonMail Secure Email.
>>>>> >>
>>>>> >> _______________________________________________
>>>>> >> sword-devel mailing list: sword-devel@crosswire.org
>>>>> >> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>> >> Instructions to unsubscribe/change your settings at above page
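
A sketch of how a build script or front-end might consult the proposed key. Everything here is an assumption for illustration: the Normalization key is only a proposal in this thread, the module name and sample conf text are invented, and the tiny parser is not SWORD's own conf reader. It treats a missing key as NFC, matching the proposed default.

```python
SAMPLE_CONF = """\
[HebExample]
Encoding=UTF-8
Normalization=Custom
"""


def read_conf(text):
    """Parse conf-style text into (module_name, {key: value}); last value wins."""
    name, entries = None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
        elif "=" in line:
            key, value = line.split("=", 1)
            entries[key.strip()] = value.strip()
    return name, entries


def must_skip_nfc(entries):
    """True when the (proposed) key says the text must not be converted to NFC."""
    return entries.get("Normalization", "NFC") != "NFC"


name, entries = read_conf(SAMPLE_CONF)
if must_skip_nfc(entries):
    print(f"{name}: rebuild with the -N switch of osis2mod")
```

A check like this is the "forewarning" described in the proposal: whoever updates the module sees that the existing conf asks for something other than NFC before running the build.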
_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page