BTW, the same 2 results were obtained by a search for "δι ημερων", i.e. 
without the U+2019 in the search key.

Best regards,

David

Sent with [Proton Mail](https://pr.tn/ref/SWXT9A5YZ67G) secure email.

On Monday, March 17th, 2025 at 6:46 PM, David Haslam <dfh...@protonmail.com> 
wrote:

> Hi DM,
>
> With Xiphos 4.3.1 (latest update) when I searched TischMorph either for "δι’ 
> ἡμερῶν", or for "δι’ ημερων", there were 2 results:
>
> - Mark 2:1
> - Acts 1:3
>
> Search results were no different with the Greek Accents on or off. I 
> therefore conclude that your hunch was incorrect!
>
> Aside:
>
> - After an exact phrase search, both results preview correctly.
> - After a Lucene fast search, both results preview really 
> [weirdly](https://www.dropbox.com/scl/fi/msw6s8dl4au5z0optwm5l/Screenshot-2025-03-17-18.43.04.png?rlkey=wps1isdrh9h1atdck6r7ihbol&dl=0)
>  & 
> [weirdly](https://www.dropbox.com/scl/fi/4aiyelopdy1a1gjlpto5f/Screenshot-2025-03-17-18.44.12.png?rlkey=bc1qmql18faoti9b6o6o27qeu&dl=0)
>  !!! I think this should be reported to Karl K. Might it be a software bug?
>
> Best regards,
>
> David
>
> On Monday, March 17th, 2025 at 6:17 PM, DM Smith <dmsm...@crosswire.org> 
> wrote:
>
>> David,
>> I’m not sure that the filter is only used for display. I think it may also 
>> be used for search. In Ancient Greek, we don’t want to have to include 
>> U+2019 as part of the search request, but just the letters.
>>
>> As a reader of NT Greek, it doesn’t bother me to have δ αρχαια rather than 
>> δ’ αρχαια.
>>
>> BTW, if the filter’s code is changed and if the filter is used for searches, 
>> then all indexes of accented NT Greek modules will need to be rebuilt. The 
>> user’s search request has to be normalized in exactly the same way as the 
>> index was constructed.
>>
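The point about normalizing the query exactly as the index was built can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SWORD's or Xiphos's actual code: `normalize`, `build_index` and `search` are hypothetical names, the "index" is a plain dictionary, and the sample entries are short phrases rather than real verse text.

```python
import unicodedata

ELISION = "\u2019"  # RIGHT SINGLE QUOTATION MARK, used as the elision mark


def normalize(text: str) -> str:
    """Hypothetical search normalization: drop combining marks and U+2019."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = [ch for ch in decomposed
            if not unicodedata.combining(ch) and ch != ELISION]
    return unicodedata.normalize("NFC", "".join(kept))


def build_index(entries: dict) -> dict:
    """Store every entry in normalized form (stand-in for a real index)."""
    return {ref: normalize(text) for ref, text in entries.items()}


def search(index: dict, query: str) -> list:
    """Normalize the query the same way, then do a substring match."""
    key = normalize(query)
    return [ref for ref, text in index.items() if key in text]


# Illustrative phrases only, not actual module content.
entries = {"sample-1": "δι’ ἡμερῶν", "sample-2": "μετ’ αὐτοῦ"}
index = build_index(entries)

for query in ("δι’ ἡμερῶν", "δι’ ημερων", "δι ημερων"):
    print(query, "->", search(index, query))
```

All three spellings of the query find the same entry because each passes through `normalize` before matching. If `normalize` were changed without rebuilding `index`, the stored text and the query key would no longer agree, which is the rebuild concern raised above.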
>> DM
>>
>>> On Mar 17, 2025, at 11:44 AM, David Haslam <dfh...@protonmail.com> wrote:
>>>
>>> Hi DM,
>>>
>>> One impact is on the StatResGNT module, in which both single and double 
>>> left/right quotation marks have been added by the project leader.
>>> Hiding Greek Accents has the bad effect of losing the end quotation mark 
>>> for all the level 2 quotations in the text.
>>> NB. It was seeing this project that prompted me to revisit this topic.
>>> It would be a real benefit to this module to make the change that I 
>>> proposed.
>>>
>>> Further to my initial thoughts late last week, I now agree that U+2019 is 
>>> the right codepoint choice to mark an elision.
>>> I was somewhat misled by the wrong answer given by Leo AI, which mistakenly 
>>> told me that it was a way to represent the iota subscript.
>>> It's only since quizzing Grok AI that my thoughts have become clear. I 
>>> admit that I should've known better, but I'm not a classicist.
>>> Yet the "category mistake" still exists - since an elision marker is not a 
>>> diacritic. And by definition, a Greek Accent is a diacritic!
>>>
>>> Making the proposed change to the filter should have a minimal effect upon 
>>> all the other Ancient Greek Bible modules.
>>> The number of words thus affected in a Greek NT module is not huge!
>>> There's really no downside to still displaying the "typographical 
>>> apostrophe".
>>>
>>> To illustrate, these are the only 21 words in TischMorph that end with 
>>> U+2019.
>>>
>>>> Word Count
>>>> Δι’ 2
>>>> Κατ’ 1
>>>> δ’ 22
>>>> δι’ 142
>>>> καθ’ 61
>>>> κατ’ 82
>>>> μεθ’ 43
>>>> μετ’ 132
>>>> μηδ’ 1
>>>> οὐδ’ 8
>>>> παρ’ 59
>>>> τοῦτ’ 17
>>>> ἀλλ’ 220
>>>> ἀνθ’ 5
>>>> ἀπ’ 119
>>>> ἀφ’ 44
>>>> Ἀλλ’ 1
>>>> ἐπ’ 143
>>>> ἐφ’ 82
>>>> ὑπ’ 25
>>>> ὑφ’ 9
>>>
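A tally like the one above could be produced along the following lines. This is a rough sketch, not the method actually used for TischMorph: `count_elided_forms` is a hypothetical helper, the set of trailing punctuation it strips is an assumption, and the sample string is a made-up phrase rather than module text.

```python
from collections import Counter

ELISION = "\u2019"  # RIGHT SINGLE QUOTATION MARK


def count_elided_forms(text: str) -> Counter:
    """Count whitespace-separated tokens that end with U+2019."""
    counts = Counter()
    for token in text.split():
        # Strip trailing punctuation (assumed set) before testing the ending.
        word = token.rstrip(".,;·")
        if word.endswith(ELISION):
            counts[word] += 1
    return counts


# Made-up phrase for illustration only.
sample = "ἀλλ’ ἐπ’ αὐτόν, ἀλλ’ οὐ δι’ ἡμερῶν"
counts = count_elided_forms(sample)

for word, n in sorted(counts.items()):
    print(word, n)
```

A check like this cannot tell an elision mark from a closing single quotation mark, so in a module that also uses U+2019 for quotations (such as StatResGNT) it would overcount.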
>>> It's now my considered view that even when the Greek accents are hidden by 
>>> the filter, the elision marks ought to be retained.
>>>
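The proposed behaviour, hiding accents while keeping the elision mark, might look like this in outline. It is a sketch under assumptions and not the SWORD filter itself: `hide_accents` is a hypothetical name, and it removes every combining mark (accents, breathings, iota subscript, diaeresis), which may be broader than what the real filter removes.

```python
import unicodedata

ELISION = "\u2019"  # RIGHT SINGLE QUOTATION MARK


def hide_accents(text: str) -> str:
    """Drop combining marks only; U+2019 is punctuation and passes through."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


# U+2019 has general category Pf (final punctuation) and combining class 0,
# so a filter keyed on combining marks never touches it.
print(unicodedata.category(ELISION), unicodedata.combining(ELISION))
print(hide_accents("δι’ ἡμερῶν"))
```

Because the test is "is this a combining mark?", closing quotation marks survive as well, which is the effect wanted for the level 2 quotations mentioned earlier.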
>>> Best regards,
>>>
>>> David
>>>
>>> On Monday, March 17th, 2025 at 3:06 PM, DM Smith <dmsm...@crosswire.org> 
>>> wrote:
>>>
>>>> David, I read your Grok 3 analysis.
>>>>
>>>> What is the impact of not having this change? What is the impact of making 
>>>> the change? Is it merely presentation, or is there an issue with searching 
>>>> too?
>>>>
>>>> I’ve also been reading 
>>>> https://corp.unicode.org/pipermail/unicode/2019-January/007563.html which 
>>>> was referenced in a prior recent thread on U+2019 in Ancient Greek. This 
>>>> is long and worth reading to understand how it might impact SWORD. The 
>>>> thread is initiated by James Tauber.
>>>>
>>>> TL;DR:
>>>> - U+2019 (and in older texts U+0027) in Ancient Greek was never used for 
>>>> quotations and is only used for elision. It is considered the recommended 
>>>> character for elisions.
>>>> - Under the Unicode TR29 rules (as they stood when the thread was written 
>>>> in January 2019), U+2019 is a word break when at the start or end of a 
>>>> word, but not within a word. It is not simply punctuation. These rules 
>>>> are not language aware.
>>>> - There is no zero width character in Unicode to join words.
>>>> - It is impossible for TR29 to distinguish between U+2019 used as a 
>>>> quotation mark and as an elision.
>>>> - There is no other character that is an appropriate replacement for U+2019.
>>>>
>>>> I haven’t yet looked at Unicode TR30 regarding folding rules as it 
>>>> pertains to this.
>>>>
>>>> In Him,
>>>> DM
>>>>
>>>>> On Mar 17, 2025, at 8:46 AM, David Haslam <dfh...@protonmail.com> wrote:
>>>>>
>>>>> Dear SWORD developers,
>>>>>
>>>>> I asked about this topic several years ago, and I'm no longer convinced 
>>>>> by what we were told back then.
>>>>>
>>>>> After doing further research, it's my understanding that U+2019 RIGHT 
>>>>> SINGLE QUOTATION MARK ought not to be hidden by this SWORD filter.
>>>>>
>>>>> - This codepoint is not a diacritic that modifies the previous Greek 
>>>>> letter. In other words, it's not a Greek accent.
>>>>> - This codepoint has the Unicode properties of a punctuation mark.
>>>>> - In Ancient Greek text, it's used to mark an elision, where the final 
>>>>> vowel of a word is omitted when the next word begins with a vowel.
>>>>>
>>>>> To view my research, conducted with the help of Grok 3, please visit the 
>>>>> following link.
>>>>>
>>>>> - https://grok.com/share/bGVnYWN5_43ff1922-3876-4d9a-9e42-6ae940007fd0
>>>>>
>>>>> I therefore recommend that SWORD developers revisit the specification for 
>>>>> this filter, and update it so that U+2019 is never hidden.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> David
>>>>>
>>>>> _______________________________________________
>>>>> sword-devel mailing list: sword-devel@crosswire.org
>>>>> http://crosswire.org/mailman/listinfo/sword-devel
>>>>> Instructions to unsubscribe/change your settings at above page