Title: signature
There is one more bizarre break with USFM syntax: the vertical bars inside the \fig ... \fig* marker pair. There must be exactly 6 of those, separating 7 fields. Four of those fields are marked in the specification as mandatory, but don't count on them being non-empty. Until recently, Paratext never enforced the structure and contents of those fields.
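To make the 6-bars/7-fields rule concrete, here is a minimal Python sketch of how a reader might split a figure. The field names (desc, file, size, loc, copy, cap, ref) are my labels for the seven positions as I understand the USFM 2.x description, and parse_fig and empty_fields are hypothetical helper names, so treat all of them as assumptions rather than anything Paratext itself uses.

```python
import re

# Assumed labels for the seven positions; not an authoritative schema.
FIG_FIELDS = ("desc", "file", "size", "loc", "copy", "cap", "ref")

# Everything between "\fig" plus one whitespace character and the closing "\fig*".
FIG_RE = re.compile(r"\\fig\s(.*?)\\fig\*", re.DOTALL)


def parse_fig(text):
    """Return a dict of the 7 fields of the first figure in text, or None if there is none."""
    match = FIG_RE.search(text)
    if match is None:
        return None
    parts = match.group(1).split("|")
    # Exactly 6 vertical bars means exactly 7 fields.
    if len(parts) != len(FIG_FIELDS):
        raise ValueError(
            "expected %d fields, found %d" % (len(FIG_FIELDS), len(parts))
        )
    return {name: value.strip() for name, value in zip(FIG_FIELDS, parts)}


def empty_fields(fields):
    """List the field names whose value is empty, mandatory or not."""
    return [name for name in FIG_FIELDS if not fields[name]]
```

Because real texts often leave "mandatory" fields empty, the sketch only insists on the field count and reports empty fields separately instead of rejecting them.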

If it seems that USFM is more complex than it needs to be, your assessment agrees with mine. It is that way, however, because of how the standard evolved: changes were made incrementally, trying not to invalidate the large number of existing USFM texts. For example, the character style terminator marker was an afterthought. The ability to nest styles was an even later afterthought. Both of those potentially conflicted with the purely flat philosophy of allowing only one character style to be active at a time, which worked for so many simple Bibles for so long. Now, however, there are many examples of nested character styles in real-world Bibles.

I hope all of this helps. Your questions, Ryan, are good ones.

On 01/02/2016 03:34 PM, Kahunapule Michael Johnson wrote:
On 01/02/2016 12:50 PM, Ryan Hiebert wrote:
The definition of a marker, the only formal definition I can find for it, is that it goes from a '\' (backslash) to the next ' ' (space). Unfortunately, this is not sufficient for two reasons. The first is that a marker may be on its own line, with a newline immediately following it, without the space required by the definition. The second is that more parsing than that must be done to identify a specific marker, as each marker has its own requirements for the text that may follow it, and some markers must be used together (specifically, those with matching ending markers).

MOST USFM markers start with "\" and terminate with white space (space or newline) or "*". There are two oddball official markers that don't follow this pattern: "~" and "//". There are also the unofficial but widely used shortcuts of "<" for "‘", ">" for "’", "<<" for "“", and ">>" for "”". If a marker has an end marker, it is the same as the beginning marker, but with the terminating space or newline replaced with "*". Note that the space after "\nd " is part of the marker, but the space after "\nd*" is not part of the marker; it is part of the text. This little bit of detail is important in avoiding adding spaces where they don't belong, such as in the Khmer language.
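The marker rules above can be sketched as a small Python tokenizer. This is only an illustration under simplifying assumptions: tokenize is a hypothetical name, marker names are taken to be letters followed by optional digits (with an optional leading "+"), the quote shortcuts are not handled, and any "//" is treated as the line-break marker even when it appears inside something like a URL.

```python
import re

# A backslash marker ends at "*", at one space or newline, or at end of input.
# "~" and "//" are the two official markers that do not start with a backslash.
MARKER_RE = re.compile(r"\\(\+?[A-Za-z]+[0-9]*)(\*|[ \r\n]|$)|(~)|(//)")


def tokenize(usfm):
    """Split USFM into (kind, value) pairs; kind is start, end, nbsp, linebreak or text."""
    tokens = []
    pos = 0
    for match in MARKER_RE.finditer(usfm):
        if match.start() > pos:
            tokens.append(("text", usfm[pos:match.start()]))
        if match.group(3):
            tokens.append(("nbsp", "~"))
        elif match.group(4):
            tokens.append(("linebreak", "//"))
        elif match.group(2) == "*":
            # End marker: the "*" terminates it, so a following space stays in the text.
            tokens.append(("end", match.group(1)))
        else:
            # Start marker: the terminating space or newline is consumed with the marker.
            tokens.append(("start", match.group(1)))
        pos = match.end()
    if pos < len(usfm):
        tokens.append(("text", usfm[pos:]))
    return tokens
```

For the input "\v 1 The \nd Lord\nd* said", the space after "\nd" is consumed with the start marker, while the space after "\nd*" comes out as the first character of the following text token, which is the behaviour needed to avoid inserting or dropping spaces in languages like Khmer.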

You can try to get access to Paratext. It might work. Either way, there is another USFM editor that works almost the same: Bibledit. You can get that for free from http://Bibledit.org.

--

Aloha,
Kahunapule Michael Johnson

MICHAEL JOHNSON
PO BOX 881143
PUKALANI HI 96788-1143

USA
eBible.org
MLJohnson.org
Mobile: +1 808-333-6921
Skype: kahunapule


_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
