Hi Arnaud,
Too many questions to answer them all, but some answers are below.
- About half of the conf file entries are derived from the module, and we have tools to automate this; the other half is not. FWIW, the derived part is mostly what is needed for the module to function, while the non-automatic part is needed to inform the reader.
- Modules can use one of a large number of versifications, which means that your blob of text in the last verse mostly goes away.
- There have been discussions about using OSIS directly, and the thought was that this is not useful.
From: sword-devel <sword-devel-boun...@crosswire.org> on behalf of Arnaud Vié <unas.zole+a...@gmail.com>
Sent: Tuesday, July 2, 2024 11:04 PM
To: SWORD Developers' Collaboration Forum <sword-devel@crosswire.org>
Subject: [sword-devel] Fwd: Making it easier to import OSIS documents in sword
Hi all,
Has anyone given any thought to simplifying the import of OSIS documents into sword?
With my bible-scraper, I'm giving users a way to easily generate OSIS documents.
The next step is to allow them to easily import the resulting document into sword... But the current process is quite painful in this regard:
- Using the osis2mod CLI, with its relatively obscure options, and manually writing a module conf file is reserved for a "technical elite".
Unless I'm missing something, non-technical users have no easy way to import an OSIS document into sword.
- Even if I want to develop a simpler frontend hiding this complexity, ideally browser-based, the fact that osis2mod is distributed as a binary makes it hard to integrate into a portable frontend that automates the process.
Strategy 1: Rewrite or recompile osis2mod in a more portable fashion
For example, it may be possible to represent most of the XML structure changes done by osis2mod (described here, implemented here) as an XSLT stylesheet or similar. This would make it easy to write portable osis2mod implementations (in Java, JS, ...) without duplicating the maintenance of the whole transformation step.
A lower-impact variant would be to keep the osis2mod code mostly unchanged, but compile it into a WASM module using Emscripten, which could then be executed natively by web browsers. I have yet to try this, though.
Strategy 2: Allow libsword/jsword to consume OSIS documents directly
OSIS is a well-documented, mostly well-specified and readable open format, whereas "sword modules" are much more tied to one specific implementation (osis2mod).
By accepting OSIS documents as input, instead of only sword modules, we would be moving from a mostly closed environment to a truly open one.
I understand that the transformations/normalisations/indexes computed by osis2mod serve to improve the runtime efficiency of accessing the bibles (not decompressing and loading a full bible into RAM all the time, etc.), so I'm not suggesting we completely get rid of them.
However, they could be taken care of at "module installation" time by the lib itself.
The lowest-impact change for libsword would be:
- Embed osis2mod logic into libsword core
- Update InstallMgr::installModule to no longer require a "mods.d", but also accept archives containing a single OSIS XML document.
In that case, plug in a call to the osis2mod logic to process the OSIS document and generate the actual module.
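
To make the proposed dispatch a bit more concrete, here is a minimal Python sketch of the "what kind of archive is this?" check. It is only an illustration, not libsword code: the function names `classify_archive` and `classify_zip` are hypothetical, and the rule used (a `.conf` file under `mods.d/` means a regular module, a single `.xml` file means an OSIS document) is an assumption about how such a check could work. A real implementation would probably also inspect the XML root element.

```python
import io
import zipfile


def classify_archive(names):
    """Classify an archive from its member names (hypothetical helper).

    Returns "sword-module" if a conf file is found under mods.d/,
    "osis-document" if the archive holds exactly one .xml file,
    and "unknown" otherwise.
    """
    # Ignore directory entries, keep only actual files.
    files = [n for n in names if not n.endswith("/")]
    if any(n.startswith("mods.d/") and n.endswith(".conf") for n in files):
        return "sword-module"
    if len(files) == 1 and files[0].lower().endswith(".xml"):
        return "osis-document"
    return "unknown"


def classify_zip(data):
    """Classify a zip archive given as bytes (hypothetical helper)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return classify_archive(zf.namelist())
```

An installer could then follow its usual path for "sword-module" and run the conversion step first for "osis-document".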
With this, the installation of such an OSIS module would take a few more seconds than for the usual modules, but in exchange would make the whole ecosystem easier to interact with.
The problem here, of course, is that we'd have to duplicate that logic in jsword, unless we also make it more portable as per strategy 1.
What are your thoughts on these two strategies?
I'm also interested in any historical insight on this sword module format, which at first glance seems much more complex than it needs to be.
For example:
- What is the purpose of offering multiple compression formats? (Half of them are not supported in the Debian libsword builds, by the way.)
- Why does osis2mod force bibles to fit into a versification (squashing all remaining text into the last verse of a chapter) instead of building a specific index that accurately represents the contents of the original OSIS document?
- Why are contents always split by testament (ot/nt.bzs/v/z)? It seems a bit arbitrary, especially since OSIS allows any kind of bookGroups.
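
To illustrate the versification question above, here is a toy Python sketch of what "squashing all remaining text into the last verse of a chapter" amounts to. It is not osis2mod's actual code: the function name `fit_to_versification` and the data layout (one chapter as a dict of verse number to text) are assumptions made for the example.

```python
def fit_to_versification(verses, max_verse):
    """Fold verses beyond max_verse into the last allowed verse.

    verses:    dict mapping verse number -> text, for a single chapter
    max_verse: highest verse number the chosen versification allows
    """
    fitted = {}
    for number in sorted(verses):
        # Verses past the end of the chapter are clamped to the last verse.
        target = min(number, max_verse)
        if target in fitted:
            fitted[target] += " " + verses[number]
        else:
            fitted[target] = verses[number]
    return fitted


chapter = {1: "a", 2: "b", 3: "c", 4: "d"}
print(fit_to_versification(chapter, 2))  # {1: 'a', 2: 'b c d'}
```

With a versification whose chapter really has four verses, the same input comes back unchanged, which is why the choice of versification matters.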
Thanks, and sorry for yet another very long email!
Cheers,
Arnaud