In the eXperimental module, all the words are already framed with a w element.
There's been no discussion of how SWORD might render consecutive words
differently.
cf. We still don't have an agreed implementation for Morph Segmentation.
IIRC, David Instone-Brewer once suggested that alternating colours might be a
way forward, but AFAICT, such suggestions have always fallen to the ground in
the SWORD developers community.
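
To make the alternating-colours idea concrete, here is a minimal sketch in Python, not anything SWORD currently does. It assumes a simplified fragment in which each word sits in a plain `w` element without namespaces; the function name `render_alternating` and the CSS class names `w-even` and `w-odd` are hypothetical.

```python
import html
import xml.etree.ElementTree as ET


def render_alternating(osis_fragment: str) -> str:
    """Turn each <w> element into an HTML span with an alternating class.

    Assumption: the fragment is well-formed XML with un-namespaced <w>
    elements. Text between <w> elements (spaces, punctuation) is ignored
    in this sketch.
    """
    root = ET.fromstring(osis_fragment)
    spans = []
    for index, w in enumerate(root.iter("w")):
        css_class = "w-even" if index % 2 == 0 else "w-odd"
        text = html.escape(w.text or "")
        spans.append(f'<span class="{css_class}">{text}</span>')
    return "".join(spans)


# Latin placeholders stand in for Khmer words.
fragment = "<verse><w>alpha</w><w>beta</w><w>gamma</w></verse>"
print(render_alternating(fragment))
```

A front end could then give the two classes slightly different colours in its stylesheet, which keeps the word boundaries in the markup rather than in the text itself.
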
Please expand on what you had in mind, Peter.
Kind regards,
David
On Thu, May 29, 2025 at 17:27, Peter von Kaehne <ref...@gmx.net> wrote:
> I think this has been discussed well.
>
> - this should be done on a semantic level and not with a kludge and a hack.
> - the obvious semantic solution is to frame words in w tags and then use
> CSS/trigger and option/whatever agreed from there.
>
> Sent from [Outlook for iOS](https://aka.ms/o0ukef)
> ---------------------------------------------------------------
>
> From: sword-devel <sword-devel-boun...@crosswire.org> on behalf of David
> Haslam <dfh...@protonmail.com>
> Sent: Thursday, May 29, 2025 3:47 pm
> To: sword-devel mailing list <sword-devel@crosswire.org>
> Cc: Modules Issues <modu...@crosswire.org>; steve.anti...@gmail.com
> <steve.anti...@gmail.com>
> Subject: [sword-devel] Fw: Repurposing U+2019 RIGHT SINGLE QUOTATION MARK as
> a Lexical Word Divider for the SE Asian scripts that have NO SPACE BETWEEN
> WORDS
>
> NB. I have cancelled the earlier email because the attachment was too large
> for sword-devel.
> It had been in the queue for moderator approval.
>
> The eXperimental module KhmerNTx.zip may now be downloaded from this
> [link](https://app.box.com/s/e613wf1qdxbjmvux9gbb6vmes33d2rol) on my box.net
> account.
>
> Please see below for the significant details.
>
> Best regards,
>
> David
>
> Sent with [Proton Mail](https://pr.tn/ref/SWXT9A5YZ67G) secure email.
>
> ------- Forwarded Message -------
> From: David Haslam <dfh...@protonmail.com>
> Date: On Thursday, May 29th, 2025 at 9:26 AM
> Subject: Repurposing U+2019 RIGHT SINGLE QUOTATION MARK as a Lexical Word
> Divider for the SE Asian scripts that have NO SPACE BETWEEN WORDS
> To: sword-devel mailing list <sword-devel@crosswire.org>
> CC: steve.anti...@gmail.com <steve.anti...@gmail.com>, Modules Issues
> <modu...@crosswire.org>
>
>> Dear SWORD Developers (and our Modules Team),
>>
>> While watching the [livestream
>> funeral](https://www.youtube.com/live/zC4hXOgqBak?si=JZ7JiM7j_fHW-sQl) of OT
>> Scholar the late Gordon D Wenham yesterday (St Mary's Church, Charlton
>> Kings), I had a bright idea.
>>
>> I'd been working recently on potential improvements for the KhmerNT module
>> relating to marking the Lexical Word Divisions.
>> Khmer is one of the languages of SE Asia whose Writing System (aka Script)
>> largely has NO SPACE BETWEEN WORDS.
>> Others include: Lao, Thai, Myanmar (aka Burmese), together with other
>> languages in the region that employ one of these scripts (e.g. Isaan).
>>
>> Until now, the KhmerNT module has used the ZWSP (Zero Width Space) to mark
>> lexical word boundaries.
>> This helps with SWORD search for whole words, because even though the
>> divisions between words are invisible to human eyes, they are accessible to
>> computer software.
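
As an illustration of why an invisible divider still helps whole-word search, here is a small Python sketch. It is not SWORD's search code; the helper names `split_words` and `contains_whole_word` are hypothetical, and Latin placeholders stand in for Khmer words.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def split_words(text: str) -> list[str]:
    """Split text on ordinary whitespace and on ZWSP, dropping empty pieces."""
    words = []
    for chunk in text.split():  # str.split() does not treat ZWSP as whitespace
        words.extend(piece for piece in chunk.split(ZWSP) if piece)
    return words


def contains_whole_word(text: str, word: str) -> bool:
    """True only if `word` matches a complete divided word, not a substring."""
    return word in split_words(text)


# "alpha", "beta" and "gamma" run together on screen; "delta" follows a space.
sample = "alpha" + ZWSP + "beta" + ZWSP + "gamma delta"
print(split_words(sample))
print(contains_whole_word(sample, "beta"), contains_whole_word(sample, "bet"))
```
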
>>
>> Wouldn't it be nice if ... (cue to sing the melody by the Beach Boys) 🎶
>>
>> - We could instead use a visible Unicode character
>> - That character could be hidden by means of an existing SWORD filter
>>
>> There is such a character!!!
>>
>> - U+2019 is one of the codepoints hidden (or changed) by the filter
>> UTF8GreekAccents.
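
The effect described in these two points can be sketched in Python as follows. This only imitates the outcome, under the assumption that the filter simply drops U+2019 when accents are hidden; it is not the UTF8GreekAccents implementation, and the function names are hypothetical.

```python
RSQUO = "\u2019"  # RIGHT SINGLE QUOTATION MARK, used here as a word divider


def hide_dividers(text: str) -> str:
    """Display form with dividers hidden: every U+2019 is removed.

    Note that a genuine apostrophe encoded as U+2019 would be removed too.
    """
    return text.replace(RSQUO, "")


def words_from_dividers(text: str) -> list[str]:
    """Recover the lexical words by splitting on whitespace and on U+2019."""
    words = []
    for chunk in text.split():
        words.extend(piece for piece in chunk.split(RSQUO) if piece)
    return words


# Latin placeholders stand in for Khmer words.
marked = "alpha" + RSQUO + "beta" + RSQUO + "gamma delta"
print(hide_dividers(marked))
print(words_from_dividers(marked))
```
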
>>
>>> U+2019 (RIGHT SINGLE QUOTATION MARK) is commonly used in digital editions
>>> of the NT Greek as the apostrophe, not as a quotation mark.
>>>
>>> In NT Greek, it appears in:
>>>
>>> - Elisions: When a vowel at the end of a word is dropped (e.g., δι’ instead
>>> of διά before a vowel).
>>> - Contractions or abbreviations: e.g., ἐπ’ for ἐπί, καθ’ for κατά.
>>>
>>> While U+2019 is typographically correct for apostrophes in modern
>>> typesetting, some older or simpler digital texts may use U+0027 (straight
>>> apostrophe). However, U+2019 is the preferred character in high-quality,
>>> properly typeset Greek texts.
>>
>> I then set about testing my idea by making a further update to an already
>> eXperimental version of the module, provisionally named KhmerNTx.
>>
>> It "worked like a dream".
>>
>> With Greek accents hidden, the text looks like this:
>>
>>> [Khmer sample text garbled in the archive copy; not recoverable] (I Peter
>>> 1:1 [KhmerNTx])
>>
>> With Greek accents displayed, the text looks like this:
>>
>>> [Khmer sample text with U+2019 dividers, garbled in the archive copy; not
>>> recoverable] (I Peter 1:1 [KhmerNTx])
>>
>> I have attached the compressed module for any of you to explore & play with
>> further.
>>
>> Aside: The previous update already made use of the OSIS XML w element to
>> enclose each lexical Khmer word. That remains the case.
>> In this way, the module source text is ready to be adapted for further
>> enhancements such as adding Strong's numbers, etc, to make a Study Edition.
>>
>> Steve Hyde and the translators in Cambodia are currently preparing to
>> publish the complete Khmer Bible.
>> He has requested my assistance in improving the actual word divisions for
>> the 39 OT books.
>> I've already been sent the source text, exported from their database.
>>
>> Since early May, I have been exploring how the Grok AI engine can make a
>> positive contribution to the success of this challenging task.
>> More on that subject later.
>>
>> Best regards,
>>
>> David
>>
>> Sent with [Proton Mail](https://pr.tn/ref/SWXT9A5YZ67G) secure email.
>
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page