Hi everyone,

As mentioned in other threads, I'd like to design a new, standard way to specify "versifications" that would make it easy to build a custom versification for each individual bible, while retaining and improving the ability to map and align different bibles with one another.

I won't dive into actual format discussions yet: I was originally thinking of going for XML in order to integrate easily into OSIS headers, but using and extending the JSON format of the Copenhagen Alliance <https://github.com/Copenhagen-Alliance/versification-specification/> (if possible, even integrating directly with their standard) could be nice.

The goal of this mail is to present, from a "functional" point of view, the core principles on which such a versification system would be built. I try to use generic terms applicable to any type of document ("reference system") and also to explain how they map concretely to OSIS bibles ("versification").

*Principle 1: A "reference system" specifies only a set of unique IDs and a unique meaning for each ID.*

A reference system is basically a map "ID -> Meaning". In two documents that use the same reference system, elements that have the same ID must have the same meaning.

Concretely, for OSIS bibles, the ID will typically be an OSIS ID, and the meaning will typically be defined by a specific text extracted from a reference bible. Between two bibles using the same versification, two elements with the same ID must be translations of the same sequence of words. We could for example define a Rahlfs-LXX versification, backed by the URL of a place where the original text can be found <https://archive.org/details/alfred-rahlfs-the-septuagint-lxx-with-apocrypha-morphological-data>: this reference clearly defines the "meaning" of each ID.

*Consequence 1-1: Ordering is defined by documents, not by the reference system.*

The current versification system is inconsistent on this point: it provides two possibly contradictory sources for the ordering of elements within a text. Any document obviously has a natural order: the order in which the elements appear in the document source. But the current sword implementation of versifications also attaches a specific ordering of elements to the reference system, which notably leads to versifications that differ ONLY by the order of the books (see MT and Leningrad). A clean reference system should not contain any notion of ordering: the only source of truth for ordering is the document itself.

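To make Principle 1 and Consequence 1-1 a bit more concrete, here is a minimal Python sketch. It is only an illustration, not a format proposal: the class names (`ReferenceSystem`, `Document`), the `unknown_ids` helper, and the sample IDs and meanings are all hypothetical placeholders. The reference system is a plain unordered map, and the order of elements lives only in each document.

```python
from dataclasses import dataclass
from typing import List, Mapping, Tuple


@dataclass(frozen=True)
class ReferenceSystem:
    """A named map "ID -> meaning". It carries no ordering information."""
    name: str
    meanings: Mapping[str, str]  # meaning is a placeholder string in this sketch

    def defines(self, ref_id: str) -> bool:
        return ref_id in self.meanings


@dataclass(frozen=True)
class Document:
    """A document lists the IDs it contains, in its own order."""
    title: str
    system: ReferenceSystem
    element_ids: Tuple[str, ...]

    def unknown_ids(self) -> List[str]:
        """IDs used by the document but not defined by its reference system."""
        return [i for i in self.element_ids if not self.system.defines(i)]


# Hypothetical data: the meanings stand in for text taken from a reference edition.
demo = ReferenceSystem(
    name="Demo",
    meanings={
        "Gen.1.1": "reference text of Gen.1.1",
        "Ps.1.1": "reference text of Ps.1.1",
        "Mal.1.1": "reference text of Mal.1.1",
    },
)

# Two bibles share one reference system but order their books differently.
bible_a = Document("A", demo, ("Gen.1.1", "Ps.1.1", "Mal.1.1"))
bible_b = Document("B", demo, ("Gen.1.1", "Mal.1.1", "Ps.1.1"))

print(bible_a.system == bible_b.system)            # True: same reference system
print(bible_a.element_ids == bible_b.element_ids)  # False: ordering is per document
```

With this split, two bibles that differ only by book order no longer need two separate versifications.
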
*Consequence 1-2: No "compromised" or ambiguous versifications*

Currently, some versifications are explicitly "compromised" (see LXX for example) in that they try to cover many possible bibles, each with minor differences. Principle 1 requires a unique meaning for each ID, usually specified by a single reference text, which prevents this.

Similarly, in the current system, versifications are ambiguous, in that actual documents may or may not use "0" IDs for pre-verse canonical contents, like the canonical psalm titles in KJV. Principle 1 requires that each ID is explicitly defined: if a versification defines a meaning for ID 0, it must do so explicitly.

*Principle 2: A reference system may be defined as a subset of another reference system*

In the previous example, if a Rahlfs-LXX versification is defined, we may define a Rahlfs-LXX-Psalms versification by considering only the IDs that belong to the book of Psalms.

*Principle 3: A reference system may be defined as an aggregation of several others*

In that case, all IDs defined in one of the underlying ref systems are valid in the resulting one, and map to the same meaning as in their original ref system. For example, if we have a Rahlfs-LXX-Psalms versification, a Vulg-Esth versification, and others for each book, they may be combined to build a versification covering a full bible.

The only requirement here is the uniqueness of meaning for each ID: we can't aggregate two ref systems that define a common ID. We must first subtract this ID from all aggregated systems except one, to remove any ambiguity.

*Principle 4: A reference system may be defined by a mapping table to another reference system*

This mapping table defines the set of IDs defined in this new ref system (left-hand side), and which IDs from the base ref system they correspond to (right-hand side). "One-to-many" and "many-to-one" mappings should be possible, to represent verses that are split or merged between the base and new versifications.

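Principles 2 to 4 can be sketched as three small operations on "ID -> meaning" maps. Again, this is only an illustration under my own assumptions: the function names (`subset`, `aggregate`, `derive`), the way a derived meaning is written ("maps to ..."), and all sample IDs and texts are hypothetical, and a real format would need a more precise way to say which part of a base verse a split verse covers.

```python
from typing import Callable, Dict, Tuple

RefSystem = Dict[str, str]                 # ID -> meaning (placeholder strings)
MappingTable = Dict[str, Tuple[str, ...]]  # new ID -> IDs in the base system


def subset(base: RefSystem, keep: Callable[[str], bool]) -> RefSystem:
    """Principle 2: keep only the IDs selected by `keep`."""
    return {ref_id: meaning for ref_id, meaning in base.items() if keep(ref_id)}


def aggregate(*systems: RefSystem) -> RefSystem:
    """Principle 3: merge systems, refusing any ID defined more than once."""
    result: RefSystem = {}
    for system in systems:
        clash = result.keys() & system.keys()
        if clash:
            raise ValueError(f"IDs defined more than once: {sorted(clash)}")
        result.update(system)
    return result


def derive(base: RefSystem, table: MappingTable) -> RefSystem:
    """Principle 4: the new IDs are the table keys, defined via base IDs."""
    derived: RefSystem = {}
    for new_id, base_ids in table.items():
        missing = [b for b in base_ids if b not in base]
        if missing:
            raise ValueError(f"{new_id} maps to unknown base IDs: {missing}")
        derived[new_id] = "maps to " + ", ".join(base_ids)
    return derived


# Hypothetical sample data, not real verse counts or texts.
rahlfs_lxx = {
    "Ps.1.1": "Rahlfs text of Ps.1.1",
    "Ps.1.2": "Rahlfs text of Ps.1.2",
    "Ps.1.3": "Rahlfs text of Ps.1.3",
    "Esth.1.1": "Rahlfs text of Esth.1.1",
}
vulg = {"Esth.1.1": "Vulgate text of Esth.1.1"}

lxx_psalms = subset(rahlfs_lxx, lambda i: i.startswith("Ps."))
vulg_esth = subset(vulg, lambda i: i.startswith("Esth."))
combined = aggregate(lxx_psalms, vulg_esth)  # no common ID, so this is allowed

custom = derive(combined, {
    "Ps.1.1": ("Ps.1.1", "Ps.1.2"),  # merged: one new verse covers two base verses
    "Ps.1.2a": ("Ps.1.3",),          # split: two new verses share one base verse
    "Ps.1.2b": ("Ps.1.3",),
    "Esth.1.1": ("Esth.1.1",),
})
```

Calling `aggregate(rahlfs_lxx, vulg)` directly would raise an error, because both define `Esth.1.1`: one of them has to be reduced with `subset` first, which is exactly the subtraction step described under Principle 3.
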
The general idea of the mapping table is similar to jsword's current versification mapping files <https://github.com/AndBible/jsword/blob/develop/src/main/resources/org/crosswire/jsword/versification/Catholic2.properties>, except that in jsword the right-hand side is always KJV (or KJVA since my contribution to the AndBible fork). Here, it can be any other versification.

*Practical application*

The practical application of these principles leads to the following setup:

- One specific "root" versification can be chosen by CrossWire and embedded in sword, to be used as the central point for mapping. That could be KJVA (as it is already the current central point for mapping), by referencing a specific edition of KJVA.
- A small set of "major" versifications are defined by CrossWire and embedded in sword, along with an accurate mapping to the root versification. These major versifications should be for versions that we consider "very influential", i.e. many bibles mostly follow their verse splits (e.g. Rahlfs LXX, possibly one MT version, etc.).
- Finally, each bible can either reuse one of these major versifications directly, or embed a custom versification built by subtracting from, aggregating, or mapping to any of the major ones.

This allows each bible to define its own versification accurately and without ambiguity, while still inheriting as much of the mappings as possible from the "major" versifications.

For example, one bible may use Rahlfs LXX for all books except Esther, and define a specific versification for Esther with an explicit mapping to KJVA. Another example: we no longer need to explicitly maintain NRSV and NRSVA, since it's very easy for these bibles to just reuse KJVA with one small mapping for the only difference.

And that's all for today, I think that description is long enough already! Let me know your thoughts! If we have a consensus on these principles, we can then start working on defining an actual format.

Regards,
Arnaud Vié
_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page