> On Jan 2, 2016, at 2:54 PM, David Haslam <dfh...@googlemail.com> wrote:
> 
> Please visit http://paratext.org/about/usfm
> [snip]

David, thanks for your assistance. Indeed, I've already become fairly familiar 
with the 2.4 USFM spec, and my attempts to implement that specification in a 
PEG grammar are what prompted my questions. I am not, however, familiar with 
actually authoring USFM files. I also do not presently have access to a copy of 
Paratext to experiment with, and from the registration form I'd surmise that I 
may not be able to get access to it.

They may seem like silly questions, but I cannot find any specific evidence in 
the spec to assume one way or the other.

For instance, in the USFM texts that I've seen, there have been _no_ lines, 
apart from blank lines, that do not begin with a marker of some kind. Is it the 
case that a line will _always_ start with a marker? The spec is not clear.
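
The observation above can be checked mechanically against any sample file. The 
following is only a minimal sketch: the function name is hypothetical, and it 
assumes that leading whitespace before a marker is tolerated, which the spec 
does not confirm.

```python
def lines_without_leading_marker(text):
    """Return (line_number, line) pairs for non-blank lines that do not
    begin with a backslash marker."""
    offenders = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Assumption: leading whitespace before a marker is ignored.
        stripped = line.strip()
        if stripped and not stripped.startswith("\\"):
            offenders.append((number, line))
    return offenders


# Made-up sample text, not taken from any real USFM file.
sample = "\\id GEN\n\\c 1\n\n\\p\n\\v 1 In the beginning\ncontinued text\n"
print(lines_without_leading_marker(sample))
```

An empty result for a given file would only show that the file happens to 
follow the pattern, not that the spec requires it.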

Typically, I'd assume that markers were intended to be an _addition_ to the 
plain text, but the examples I've seen suggest that empty lines likely have no 
semantic significance, which argues against that assumption.

The definition of a marker, the only formal definition I can find for it, is 
that it goes from a '\' (backslash) to the next ' ' (space). Unfortunately, 
this is not sufficient for two reasons. The first is that a marker may be on 
its own line, with a newline immediately following it, without the space 
required by the definition. The second is that more parsing than that must be 
done to identify a specific marker, as each marker has its own requirements for the 
text that may follow it, and some markers must be used together (specifically, 
those with matching ending markers).
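
To illustrate the first problem, here is a rough sketch of a looser marker 
pattern that accepts a newline or end of line where the definition asks for a 
space. The regular expression and the function name are my own assumptions, 
not anything taken from the spec, and the sketch makes no attempt at the 
second problem (per-marker rules and the pairing of opening and closing 
markers).

```python
import re

# Assumption: a marker is a backslash followed by one or more characters
# that are not whitespace, a backslash, or an asterisk, optionally followed
# by '*' for the ending form. The marker stops at any whitespace or at the
# end of the line, not only at a space.
MARKER = re.compile(r"\\([^\s\\*]+)(\*?)")


def find_markers(line):
    """Return (name, is_ending_form) pairs for each marker found in a line."""
    return [(m.group(1), m.group(2) == "*") for m in MARKER.finditer(line)]


print(find_markers("\\p"))
print(find_markers("\\v 1 And the \\nd Lord\\nd* said"))
```

A marker alone on its line, such as \p, is matched here even though no space 
follows it, which is the case the space-terminated definition misses.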

I hope that I've convinced you that I am doing the required work to understand 
USFM, and that my questions are coming at an appropriate time, so as not to 
waste your time needlessly. They are targeted at specific situations that I 
think will give me the best insight into how I should be looking at USFM 
markup.


1. Is text allowed to be on a line _without_ a marker starting the line?
2. Are blank lines semantically meaningful? That is, if all the blank lines are 
removed, does the file mean _exactly_ the same thing?
3. Are the non-text markers (ones that don't have the ending form, \usfm*) 
required at the beginning of all meaningful lines?
4. Is only one non-text marker allowed per line?
5. Must a non-text marker be only at the beginning of a line?
_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
