Chris,
When I compared nave.dat to the zip from biblecom.net, I found only a few differences. The Sword module does not maintain the book order of the words, but rather alphabetizes them with what appears to be an ASCII collation. The cross-references using the T0000456 notation are removed. Also, a special marker that preceded the T* reference, and a small handful of (n/a) strings (these were left in the text), were changed to something more printable, a chevron.
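
To make the ordering difference concrete, here is a minimal Python sketch. The titles are illustrative only (not taken from nave.dat), and the `position` map is just one assumed way of remembering the original book order so that it can be restored after an ASCII sort.

```python
# Illustrative titles only; the real list would come from the source text.
book_order = ["SINAI", "SIN (1)", "SIN-OFFERING", "SIN (2)"]

# Python compares str values by code point, which matches byte order for
# pure-ASCII text: space (32) sorts before "-" (45), which sorts before "A" (65).
ascii_sorted = sorted(book_order)

# Recording each title's original position lets the book order be restored.
position = {title: i for i, title in enumerate(book_order)}
restored = sorted(ascii_sorted, key=position.__getitem__)

print(ascii_sorted)
print(restored)
```

If the module format needs sorted keys for lookup, a position map like this could be kept alongside the index so that neither ordering is lost.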

Also, in the original, there were two SIN entries, which are now SIN (1) and SIN (2). The first one is the dastardly deed and the second is the desert between Elim and Sinai. (I am wondering whether this is correct. Should a dictionary require uniqueness of entry names? Can we manage this with two different osisIDs, say SINdeed and SINloc?)
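
As a rough sketch of the osisID idea, the following Python function gives every duplicated title a distinct identifier. The function name `disambiguate`, the `qualifiers` mapping, and the numeric fallback are all assumptions made for illustration; whether the resulting strings are acceptable osisIDs would still need to be checked against the OSIS schema.

```python
from collections import Counter

def disambiguate(titles, qualifiers=None):
    """Return one identifier per title, in order, with duplicates made distinct.

    `qualifiers` (hypothetical) maps (title, occurrence) to a hand-chosen
    suffix, e.g. ("SIN", 1) -> "deed". Duplicates without a qualifier fall
    back to their occurrence number. Unique titles are returned unchanged.
    """
    qualifiers = qualifiers or {}
    totals = Counter(titles)   # how often each title occurs overall
    seen = Counter()           # how often each title has occurred so far
    ids = []
    for title in titles:
        if totals[title] == 1:
            ids.append(title)
            continue
        seen[title] += 1
        suffix = qualifiers.get((title, seen[title]), str(seen[title]))
        ids.append(title + suffix)
    return ids

names = {("SIN", 1): "deed", ("SIN", 2): "loc"}
print(disambiguate(["SIN", "SINAI", "SIN"], names))
```

Note that this does not guard against a generated identifier colliding with an existing title, so a real conversion would want a final uniqueness check.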

My thought was that I would prepare a complete OSIS document and then build the module from that. In doing so, I think that it would be good to preserve the original book order of the terms. Will this cause a problem?

DM

Chris Little wrote:

DM Smith wrote:

Well, I am starting to work on updating "Naves". I am trying to track down the source. I mentioned earlier that I found a copy of naves.zip at http://aibi.gospelcom.net/downloads/naves.zip. But bf.org no longer has any e-texts posted. I searched the Internet and all references to naves were to one or the other of these.

I took a close look at it and also at the naves module (using mod2imp). The one at gospelcom.net looks to be an older copy (has the phone for the Bible Foundation BBS while the module has the website address).

There are some other differences. While I have not gone over it line by line, it appears that the Sword module has additions and corrections.


Could you characterize the additions/corrections? An example or two would help too.

This brings up an interesting issue. I get the impression that Sword modules are a transformation of an original electronic text. In the case of Naves, the transformation is slightly lossy in that it throws away some markup. I think that it is important to preserve the source from which the modules are created. I would like to suggest that, going forward, we try to do this for new modules, at least for those in the public domain.


Looking at the nave.dat file that ships with the module, I would suggest using that as a basis. I made this module before imp2ld or any of the other generalized import tools were written, so it uses a concatenated version of all the Nave's data files from bf.org. The indexer I wrote would have simply identified the title and the start & end points of each article and written them to nave.idx, leaving the whole .dat file intact.
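
The kind of indexer described above can be sketched in Python as follows. This is not the original indexer, and the layout it assumes is hypothetical: each article is taken to begin with a line starting with a marker (`$$$` here, chosen only for illustration) followed by the title. It records the title plus start and end byte offsets without modifying the data.

```python
def build_index(data: bytes, marker: bytes = b"$$$"):
    """Return (title, start, end) byte offsets for each article in `data`.

    Assumes (hypothetically) that every article begins with a line that
    starts with `marker`, followed by the article title.
    """
    starts = []
    offset = 0
    for line in data.splitlines(keepends=True):
        if line.startswith(marker):
            title = line[len(marker):].strip().decode("ascii", "replace")
            starts.append((title, offset))
        offset += len(line)
    # Each article ends where the next one starts; the last ends at EOF.
    ends = [start for _, start in starts[1:]] + [len(data)]
    return [(title, start, end) for (title, start), end in zip(starts, ends)]

sample = b"$$$AARON\nfirst\n$$$ABEL\nsecond\n"
for title, start, end in build_index(sample):
    print(title, start, end)
```

Because only offsets are stored, the .dat file stays intact and an article is read back as `data[start:end]`.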

At the time, Sword didn't support much of anything in the way of markup, except for #...| (indicating cross-references). The only other markup that I see in the original files is italics for the titles (which aren't even an accessible part of the articles in the index) and the OLB topic hyperlinks (which were removed since they would just appear as garbage in Sword). If you're referring to the removal of the latter as lossiness, you need to bear in mind that they are simply a convention used in early versions of OLB. They would not be used in an OSIS-encoded version of the document since OLB topic database numbers have no real meaning. The article title itself should be used for osisIDs as well as osisRefs to the entries with those osisIDs. (The topic numbers would be useful for creating the links themselves for an OSIS version, but there is no circumstance under which it would be appropriate to display them to users, and at the time there was no practical way to include them in the .dat but hide them from users.) If there was something else that was removed, I haven't noticed it.
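
For anyone converting the text, a small Python sketch of how the two kinds of markup mentioned in this thread might be located. Both patterns are assumptions: the cross-reference pattern treats everything between `#` and `|` as the reference, and the topic-number pattern assumes "T" followed by seven digits, based only on the single T0000456 example above. The sample strings are invented for illustration.

```python
import re

# "#...|" cross-reference notation: capture the text between "#" and "|".
XREF = re.compile(r"#([^|#]*)\|")

# OLB-style topic numbers, assumed to be "T" plus seven digits.
TOPIC = re.compile(r"\bT\d{7}\b")

def cross_references(text):
    """Return the contents of every #...| span, in order of appearance."""
    return [ref.strip() for ref in XREF.findall(text)]

def topic_numbers(text):
    """Return every token that looks like an OLB topic number."""
    return TOPIC.findall(text)

print(cross_references("See #Ge 1:1| and #Ex 16:1|."))
print(topic_numbers("See T0000456 and T0000012"))
```

The topic numbers found this way could be mapped to article titles to build osisRefs, without ever being shown to users.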

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
