In OSIS a verse tag can be either a container, as in <verse>text</verse>, or a marker, as in <verse/>text. According to the spec, a work should use one form or the other, but not both.

The marker form is needed to handle overlapping structures, such as a verse that starts in one paragraph and ends in another. Another example is a quote that starts within a verse and spans several verses.

Alternatively, OSIS allows these kinds of text elements to be markers as well.

This only becomes an issue if text is richly marked up with text elements.

I am not sure whether SWORD cares which form is used.
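
On q3 in the quoted message: a minimal sketch of the search-and-replace, written in Python rather than sed. It assumes the tags look exactly like <chapter value="N"> and <verse value="N">, that every verse follows its chapter tag in the file, and that the container form is kept. The function name add_osis_ids and the book abbreviation passed in are placeholders for illustration, not part of any SWORD or OSIS tool.

```python
import re

# Matches either a chapter tag or a verse tag in the source markup
# (tag shapes are taken from the quoted message; adjust if yours differ).
TAG_RE = re.compile(r'<chapter value="(\d+)">|<verse value="(\d+)">')


def add_osis_ids(text, book):
    """Rewrite chapter/verse tags so they carry osisID attributes.

    `book` is the OSIS book abbreviation, e.g. "Matt" (supplied by the caller).
    """
    current_chapter = None

    def repl(match):
        nonlocal current_chapter
        if match.group(1) is not None:
            # Chapter tag: remember the number for the verses that follow.
            current_chapter = match.group(1)
            return f'<chapter osisID="{book}.{current_chapter}">'
        if current_chapter is None:
            raise ValueError("verse tag found before any chapter tag")
        # Verse tag: combine book, current chapter and verse number.
        return f'<verse osisID="{book}.{current_chapter}.{match.group(2)}">'

    # re.sub visits matches left to right, so chapter state stays in order.
    return TAG_RE.sub(repl, text)


if __name__ == "__main__":
    sample = (
        '<chapter value="1"><verse value="1">a</verse>'
        '<verse value="2">b</verse></chapter>'
    )
    print(add_osis_ids(sample, "Matt"))
```

Run once per book with the matching abbreviation. Closing tags are left untouched, so switching to the marker form would need a separate pass.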

[EMAIL PROTECTED] wrote:

Dear knowing ones,

There are a few things I am struggling with just now, and I wonder whether I could 
get some advice:

As described previously, my text is some XML variety, the dump of Paratext. 
Everything is marked up - which is good, but it uses different tags than OSIS - 
which is bad. I am in the process of changing it over to OSIS, but as I cannot 
yet script I must do things by hand - which is a bit grim.

q1) In an OSIS-prepared module, do the verses need an osisID? I reverted the Suaheli 
module (mod2osis) and found that only the chapters are tagged, while the verses appear to 
be simply one verse per line. I assume that the software counts the verses "by 
hand". Is this true?

q2) Do the chapters need to have a complete osisID a la "Matt.1", or are 
short versions possible? I read in the OSIS manual that a simple leading blank will be 
interpreted as referring to the current text, but the reference is a bit ambiguous and 
not covered by an example.

q3) Currently the chapters are coded as <chapter value="1"> and the verses as <verse 
value="1">. A simple search and replace would need to be done at chapter level to get all verses 
coded, or at book level to get all chapters coded properly, but a sed script, for example, would probably do this in a 
minute for the whole book. Are there some sample scripts around that do the above and that I could adapt? 
A regex would probably cover this too, but I am clueless about those as well.

q4) The Bible is obviously in Unicode with intermittent changes from l->r and r->l. The - to me - odd result of 
this is that each verse follows this scheme: "<verse value="1"></verse> edhfoo 
fgfuwgfp ", with the text trailing the end marker. At least this is what I see when I open the module in gedit, 
emacs and kate. Is this normal and OK?

Thanks

Peter



_______________________________________________ sword-devel mailing list sword-devel@crosswire.org http://www.crosswire.org/mailman/listinfo/sword-devel
