On 08/01/2012 01:11 AM, David Haslam wrote:
Three poorly defined uses of USFM verse tags that we often encounter are as
follows:

Spanned verses (translators use a verse range), thus

\v 7-11 Text for these five verses

or (worse still, a naughty use of the comma delimiter)

\v 3,4 Text for two consecutive verses

Split verses (translators using composite verse tags with parts a and b of
the text), e.g.

\v 19a Text for the first part of verse nineteen
\v 19b Text for the second part of verse nineteen

Based on my experience encoding USX, I believe these are all invalid. They're definitely invalid from the perspective of USX, where verse values must be sequential numerals. Even so, we need to treat them all as valid, because a reasonable reading of the USFM documentation gives no indication that they should be invalid.
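
To make the three cases concrete, here is a rough Python sketch of how a verse value might be recognised and expanded. It is not taken from usfm2osis.py; the names VERSE_TAG and expand_verse_value are made up for illustration, and it assumes part letters are single lowercase letters.

```python
import re
import string

# Hypothetical pattern, not from usfm2osis.py: matches a \v marker whose value
# is a single verse (7), a range (7-11), a comma list (3,4), or a part (19a).
VERSE_TAG = re.compile(r'\\v\s+(\d+[a-z]?(?:[-,]\d+[a-z]?)*)\s')

def expand_verse_value(value):
    """Return the list of verse identifiers covered by one verse value."""
    verses = []
    for part in value.split(','):
        if '-' in part:
            start, end = part.split('-', 1)
            # Drop any part letters before counting through the range.
            first = int(start.rstrip(string.ascii_lowercase))
            last = int(end.rstrip(string.ascii_lowercase))
            verses.extend(str(n) for n in range(first, last + 1))
        else:
            # Single verses and lettered parts are kept as written.
            verses.append(part)
    return verses

match = VERSE_TAG.match(r'\v 7-11 Text for these five verses')
print(expand_verse_value(match.group(1)))
```

Whether a lettered part such as 19a should map to its own identifier or be merged back into verse 19 is a separate decision that this sketch leaves open.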



btw. Peter and I have collected a substantial body of real world USFM suites
which you could probably use for testing your conversion script.

That's likely to be quite helpful soon, when I get to the point of writing regression tests. I've currently got the Open Bible Translation, WEB, and RV happily running through the script and generating valid OSIS. There are still 19 tags that usfm2osis.pl handles which I haven't addressed in usfm2osis.py, so I'd like to add handling for all of them, so that the new utility is at least as capable as the old. Then I may run it on Michael's collection of documents and complete coverage of its set of tags.

Then, I'll begin collecting markup samples, verifying that the script generates valid & reasonable output against those samples, and writing tests against the generated output so that we can be certain that future changes to the script do not break it.
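
As a rough idea of what those tests could look like, here is a minimal Python sketch of a sample-based check. The function check_samples is hypothetical, and the converter is passed in as a parameter because this sketch makes no assumption about how usfm2osis.py exposes its conversion step.

```python
def check_samples(convert, samples):
    """Return (source, expected, actual) for every sample whose output changed.

    convert: any callable that takes markup text and returns converted text.
    samples: iterable of (source, expected_output) pairs verified by hand.
    """
    failures = []
    for source, expected in samples:
        actual = convert(source)
        if actual != expected:
            failures.append((source, expected, actual))
    return failures

# Stand-in converter for demonstration only; a real run would pass the
# script's own conversion function and hand-verified OSIS output.
demo_samples = [("abc", "ABC"), ("def", "DEF")]
print(check_samples(str.upper, demo_samples))
```

An empty list means every sample still converts to its verified output; anything else points at the markup that a change has broken.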

Somewhere down the line I may incorporate xreffix.pl-type functionality as an option for those who have installed Sword bindings.

I'll definitely save the example USFM markup you included here and would welcome additional examples of potentially problematic markup.

--Chris


_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
