YAB (yet another beta) (downloadable from the links at the bottom of http://www.crosswire.org/~dmsmith/kjv2006)

Again, I really value your input and the time you take to evaluate these betas.

The process I am following is that each beta does two things:
fixes problems introduced by the changes in the previous betas
fixes a new class of problems

At any point, I think we can "deliver" the module. Also, it is a simple thing to undo any of the changes. So if we decide to hold off on any of the changes, that's fine.

I think a goal (well it is at least mine) should be to get this out as soon as possible. My motivation is that using JSword, the encoding errors in this module make BibleDesktop look especially bad. Since this is the most downloaded module, I think this is an important goal. I don't mind postponing some of the changes to reach this goal.

I have found differences between the module and the printed text, where the module agrees with the other (two) etexts. So short of proof-reading against the printed text, there may be unknown differences.

This beta cleans up the problems reported against the last beta (most related to apostrophes). I still have to apply the fixes to those that are in OT notes.

I have found and fixed more punctuation problems.

I have compared all the differences between the text of this module and the printkjv and Tim Lanfear's CCEL work. I checked each of these against the Old Scofield, using it as the final arbiter.

There are three significant changes for this beta:
This one fixes words that appear in italics.

This one also fixes titles and adds missing book titles. The only titles that I am aware I have not done are the psalm books I-V. While I may have made mistakes, I have verified each of these against the "original". I have preserved the case and punctuation of the original, but have not attempted to add line breaks to book titles. The only encoding of titles that I am not sure about is that of the Psalm 119 ALEPH., BETH. ... titles. They should print before the verse. I have made these titles within the verse, since by its nature an OSIS title titles the element that "contains" it. In this fashion they are all effectively subType="x-preverse", but I have not marked them as such.
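To make the title question concrete, here is a rough Python sketch of the two encodings being weighed: a title placed inside the verse with no subType, and the same title explicitly marked subType="x-preverse". The helper name verse_with_title is made up for illustration, and the markup is simplified (no OSIS namespace, container-style verse elements), so treat it as an assumption about the shape of the markup, not as what the module actually contains.

```python
import xml.etree.ElementTree as ET

def verse_with_title(osis_id, title_text, verse_text, preverse=False):
    """Build a simplified <verse> whose first child is a <title>.

    Hypothetical helper: real OSIS uses a namespace and often milestone
    verses; both are left out here to keep the sketch short.
    """
    verse = ET.Element("verse", {"osisID": osis_id})
    # Only add the subType attribute when the title is explicitly marked.
    attrs = {"subType": "x-preverse"} if preverse else {}
    title = ET.SubElement(verse, "title", attrs)
    title.text = title_text
    # The verse text follows the title inside the same verse element.
    title.tail = verse_text
    return ET.tostring(verse, encoding="unicode")

text = "Blessed are the undefiled in the way, who walk in the law of the LORD."
print(verse_with_title("Ps.119.1", "ALEPH.", text))
print(verse_with_title("Ps.119.1", "ALEPH.", text, preverse=True))
```

The first line printed is the unmarked form described above; the second shows what explicit marking would add.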

This one also starts the fixing of hyphenated names. The KJV2003 edition is fairly uniform in not having hyphenated names. However, every printed copy of the KJV that I have uses them. My take is that we need to preserve the "jots" and "tittles", so I am adding these back. I have taken a list of names that I got from Tim Lanfear (who did the CCEL KJV module) as a start. I have changed all of them according to his list and am now validating them, verse by verse. (I am in the B's. That's why I say this is a start.)
I am using an en-dash (U+2013) to encode the hyphen. We may want to change the SWORD engine to handle hyphenated words better (e.g. lucene indexing and searching).
A couple of interesting things I have just found out:
A name is not uniformly hyphenated. (e.g. Abi-ezer is hyphenated about half the time. Beth-lehem is hyphenated in the OT but not the NT.)
Tim Lanfear pointed out to me that in the Hebrew the hyphen is a special character. Some of the English hyphenated names are separate words in the Hebrew. Strong's may have more than one number for a hyphenated word, with each part having its own (e.g. Bar-jona and Bar-jesus in the NT).
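For anyone who wants to play with the indexing question, here is a small Python sketch of the issue. It is not lucene and says nothing about how lucene actually tokenizes; it only illustrates what happens when a tokenizer treats the en-dash as part of a word versus a word break, and one possible way to index a hyphenated name so that it can be found with or without the hyphen. The function names tokenize and index_terms are made up for this example.

```python
import re

EN_DASH = "\u2013"       # the character the module uses in hyphenated names
HYPHEN_MINUS = "\u002d"  # the ordinary ASCII "minus"

def tokenize(text, joiners=""):
    """Split text into words; characters in `joiners` stay inside a word."""
    word = r"[^\W\d_]+"  # a run of letters
    if joiners:
        pattern = word + "(?:[" + re.escape(joiners) + "]" + word + ")*"
    else:
        pattern = word
    return re.findall(pattern, text)

def index_terms(token):
    """Return the lower-cased terms a hyphenated token could be indexed under."""
    parts = [p for p in re.split(r"[\-\u2013]", token) if p]
    if len(parts) == 1:
        return {token.lower()}
    # Whole name, the name with the hyphen removed, and each part by itself.
    return {token.lower(), "".join(parts).lower(), *(p.lower() for p in parts)}

sample = "Beth" + EN_DASH + "lehem and Beth" + HYPHEN_MINUS + "el"
print(tokenize(sample))                   # every dash is a word break
print(tokenize(sample, joiners=EN_DASH))  # only the en-dash joins
print(sorted(index_terms("Beth" + EN_DASH + "lehem")))
```

Indexing the joined form as well would let a search for Bethlehem find the OT spelling too, at the cost of a larger index.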

Next steps:
Finish validating the hyphenated names. (This may span one or more betas.)
See how lucene handles the indexing of hyphenated words using an en-dash and a minus, and report the results here. (I am thinking that a minus is seen as a word break but an en-dash is not.)
Fix the <divineName> encodings. Sometimes these encompass more than just the divine name. Also, the print versions typically use "small-caps" and render Lord, not LORD, with it. This appears to have been a tradition since the 1611 printing. I think it has been the tradition of etexts to use LORD as a way of rendering small caps, but with the explicit markup of OSIS this is not necessary. However, it may be necessary for the front-ends to change to accommodate this. It might also be nice to change the SWORD engine to mark in the lucene index the verses containing the divine name and allow searching on LORD (i.e. some i18n marker, e.g. HERR auf Deutsch, SEIGNEUR en français, SEÑOR en español...) to find those verses.
Validate the paragraph marks. (There are more than there should be.)
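
As a rough illustration of the <divineName> step, the Python sketch below does two things on a tiny fragment: it lists <divineName> elements that hold more than one word (candidates for "encompasses more than just the divine name"), and it collects the verses containing a divine name so that a search on a localized marker can return them. The markup is simplified (no namespace, container verses), the Exod.20.2 mis-encoding is invented for the example, and the alias set and function names are hypothetical; this is not how the SWORD engine works today.

```python
import xml.etree.ElementTree as ET

# Hypothetical i18n markers that should all mean "the divine name".
DIVINE_NAME_ALIASES = {"lord", "herr", "seigneur", "se\u00f1or"}

SAMPLE = """<div>
<verse osisID="Gen.1.1">In the beginning God created the heaven and the earth.</verse>
<verse osisID="Gen.2.4">in the day that the <divineName>Lord</divineName> God made the earth and the heavens,</verse>
<verse osisID="Exod.20.2">I am the <divineName>Lord thy God</divineName>,</verse>
</div>"""

def verses_with_divine_name(osis_fragment):
    """Return the osisIDs of verses that contain a <divineName> element."""
    root = ET.fromstring(osis_fragment)
    return {v.get("osisID") for v in root.iter("verse")
            if v.find(".//divineName") is not None}

def oversized_divine_names(osis_fragment, max_words=1):
    """List (osisID, text) for <divineName> elements with too many words."""
    root = ET.fromstring(osis_fragment)
    hits = []
    for verse in root.iter("verse"):
        for dn in verse.iter("divineName"):
            text = "".join(dn.itertext())
            if len(text.split()) > max_words:
                hits.append((verse.get("osisID"), text))
    return hits

def search(query, divine_index):
    """Return the marked verses when the query is one of the aliases."""
    return sorted(divine_index) if query.lower() in DIVINE_NAME_ALIASES else []

index = verses_with_divine_name(SAMPLE)
print(oversized_divine_names(SAMPLE))
print(search("HERR", index))
```

With the verses marked this way, the search no longer depends on the text being spelled LORD in capitals.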

And many thanks to those who have e-mailed me lists of verses that need to be fixed. Special thanks to Tim Lanfear for his detailed feedback and Terry Biggs for the SWORD engine changes.


_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
