Greg Hellings wrote:
DM,

On Mon, Mar 16, 2009 at 12:03 PM, DM Smith <dmsm...@crosswire.org> wrote:
I've been looking at the code regarding Alternate Versification (aka av11n
and v11n; I've seen these abbreviations used by Troy, Chris and others).

It looks solid. The purpose of this note is to give it a big thumbs up.

Basically here is what I see: (Chris, Troy, correct me where I am off base!
Please!)
Today (1.5.11 and earlier) speed is a major consideration and canon.h
provides for that. The core functionality of looking up a verse or intro is
to convert a verse key into an offset in the module's index. Without going
into it in great detail, the module, testament, book and chapter
introductions are addressable in the index, as well as each verse.

In 1.5.12, canon.h no longer includes a fast lookup for this. Instead it
includes the KJV versification: books by name, number of chapters and number
of verses per chapter. The new VerseMgr takes this and dynamically builds
the old lookup table, hiding it behind its API. The performance hit is
taken once each time the program is run for each versification scheme that
is requested.
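
To make the idea concrete, here is a minimal sketch of deriving such a lookup table from per-chapter verse counts. It is not the VerseMgr code: the slot layout, the function names and the toy versification are all assumptions made for illustration, and the verse counts are made up rather than taken from the KJV.

```python
# Toy versification: book name -> verses per chapter.
# Made-up numbers for illustration only, not the KJV data in canon.h.
TOY_V11N = {
    "BookA": [3, 2],
    "BookB": [4],
}


def build_offsets(v11n):
    """Return {(book, chapter, verse): index slot} for one testament.

    Assumed layout (hypothetical): slot 0 is the module intro and slot 1
    the testament intro; each book then gets one intro slot, each chapter
    one intro slot, followed by one slot per verse. Chapter 0 or verse 0
    in a key denotes the corresponding intro.
    """
    offsets = {}
    slot = 2  # slots 0 and 1 are reserved for module and testament intros
    for book, chapters in v11n.items():
        offsets[(book, 0, 0)] = slot  # book intro
        slot += 1
        for chap_no, verse_count in enumerate(chapters, start=1):
            offsets[(book, chap_no, 0)] = slot  # chapter intro
            slot += 1
            for verse_no in range(1, verse_count + 1):
                offsets[(book, chap_no, verse_no)] = slot
                slot += 1
    return offsets


_cache = {}


def offsets_for(name, v11n):
    """Build the table once per versification name, then reuse it."""
    if name not in _cache:
        _cache[name] = build_offsets(v11n)
    return _cache[name]
```

The cache mirrors the point above: the cost of building the table is paid once per run for each versification that is actually requested, and every later lookup is a plain table access.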

Chris has taken the CCEL versifications and written a Perl program that uses
them as input to generate the same structure for each versification.

Currently, the VerseMgr does not know about the different V11Ns. It looks
like that is all that is left for it.

If I am understanding this correctly, this leads me to believe that GenBooks
are not going to be used, but rather regular Bible modules. If this is true,
it is a boon to commentaries as well, as commentaries are structured
internally as Bibles. And it gives us compressed modules. And it gives us
the speed of the Bible module (GenBook is very slow in comparison.)

I had been concerned about GenBooks being used, since osis2mod performs
transformations and the GenBook importer does not.

Have there been any moves to reduce the number of transformations
osis2mod performs, so that the stored format is even closer to the
import format (preferably lossless)?

The answer is yes and no.

The goal of the transformation is:
   To allow for any valid, well-written OSIS input.

There are a few purposes to the transformations:
1) To position interverse material (currently headings) as an introduction to a book or chapter, as an addition to the end of the prior verse, or as a "pre-verse" heading for the following verse. This is lossy. We have discussed, and partially agreed upon, a lossless transformation. We have yet to implement it. (I think this should be part of the 1.5.12 release.)

2) To transform Book/Chapter/Section/Paragraph OSIS (which is best for OSIS authors) into BCV OSIS, which is best for applications.
   While this is lossless, it is not reversible.

3) To handle the Words of Christ in a way that works for a verse in isolation and also for the OSIS writer. (Verses are isolated in search results, in table cells used for parallel views, etc.)
   This is lossless, but not easily reversible.

What prevents all Bible modules
from being stored in an inherently OSIS format internally with the
indexes created by the engine simply leveraging certain points in an
OSIS file, rather than in a separate binary format?  It seems that
would be the most lossless format available, but I'm curious as to
what technical issues might prevent that from being the most desirable
method?

There are a variety of reasons that it would not work. Here are some.
1) The SWORD and JSword engines cannot handle all possible OSIS inputs without major changes.

2) The markup for any given verse or passage might not be a well-formed XML fragment. Without a complete fragment, a compliant XML parser cannot be used. A tolerant parser has to make guesses, which might be wrong. We would need to expand the reference in order to get a well-formed fragment, which might be computationally expensive. Pre-computing the well-formed context of a verse is a possibility.

3) Well-formed fragments might not have sufficient context to display properly. For example, Matt 6 is the middle of the Sermon on the Mount, but just reading the OSIS markup for that chapter might not make that obvious.
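
As a rough illustration of the well-formedness problem, the sketch below cuts a single verse out of a simplified, OSIS-like chapter in which a quotation runs across a verse boundary. The markup, the "X.1.1" identifiers and the helper names are made up for the example; only Python's standard xml.etree.ElementTree parser is used.

```python
import xml.etree.ElementTree as ET

# Simplified OSIS-like markup: verses are milestones (sID/eID),
# and the <q> element opens in verse 1 but closes in verse 2.
CHAPTER = (
    '<chapter>'
    '<verse sID="X.1.1"/>He said, <q>Go<verse eID="X.1.1"/>'
    '<verse sID="X.1.2"/>and return.</q><verse eID="X.1.2"/>'
    '</chapter>'
)


def verse_slice(xml_text, osis_id):
    """Return the raw text between a verse's start and end milestones."""
    start_tag = f'<verse sID="{osis_id}"/>'
    end_tag = f'<verse eID="{osis_id}"/>'
    start = xml_text.index(start_tag) + len(start_tag)
    end = xml_text.index(end_tag)
    return xml_text[start:end]


def is_well_formed(fragment):
    """Check whether a fragment parses once wrapped in a single root element."""
    try:
        ET.fromstring(f"<wrapper>{fragment}</wrapper>")
        return True
    except ET.ParseError:
        return False


verse1 = verse_slice(CHAPTER, "X.1.1")  # 'He said, <q>Go'
verse2 = verse_slice(CHAPTER, "X.1.2")  # 'and return.</q>'
print(is_well_formed(verse1))           # False: <q> is never closed
print(is_well_formed(verse2))           # False: </q> has no opening tag
print(is_well_formed(verse1 + verse2))  # True: the expanded range balances
```

Neither verse parses on its own, while the two together do, which is the "expand the reference" step described in point 2. Finding the smallest balanced range for an arbitrary verse is the part that could be expensive or would need to be pre-computed.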
