From a practical perspective, SWORD and JSword look at canonical only on titles, 
to determine whether those titles are hidden along with the other titles and 
intros (aka headings).

Just a few comments about my understanding. Probably a purist;) From the manual 
(v2.1.1), page 18.
> When canonical="true", it means that the content of that element is a part of 
> the text being encoded.
...
> It should be explicitly noted that the value of the canonical attribute 
> should not be used to reflect theological judgment about the content of a 
> text, but merely to distinguish between what has been added to the text and 
> what has not. 
> 
> In most cases use of the canonical attribute is straightforward, and the 
> default values will almost always produce the intended result. However, there 
> will arise truly difficult cases: for example, one may be encoding an ancient 
> text with annotations of its own. In that case those notes would be 
> canonical, while any added by the current editor would not be. In such cases, 
> the practice chosen and its rationale should be described in the work's 
> documentation.

So, I take this to mean that if I were creating an accurate representation of 
the 1611 KJV from scans, everything in that "ancient" text would be canonical, 
including introductions, notes, titles, cross-references, and so forth.

If it is not that way, and it is to reflect the underlying publication, then I 
think there is a problem with the usage of the <transChange type="added"> 
element. In this case these should be marked canonical="false", as they are not 
part of the "base" text.

I took out the example about notes in a Bible translation. Its intent is that 
canonical is to distinguish what was in the text the translation was based on 
from what was not in that base.

The confusion is that it is not at all clear what "current editor" means. There 
are many who take the KJV, notes and all, and make changes to it, say 
modernizing the spelling or translating it into another language. So, since 
their base is not the Hebrew and Greek but a particular KJV text, according to 
this definition the imported notes are now canonical.

But as a module encoder, I'd do it the way the OSIS defaults are, with one 
exception: The <div> element.
> The canonical attribute is available on all elements. 
> 
The following elements do not have canonical:
osis
osisCorpus
teiHeader
work
workPrefix

> It has a ‘default’ value so it does not have to be entered by the encoder if 
> the default value is acceptable. 
> 

A bit misleading. Only a few (8) elements actually have a default. Note that 
chapter is not among them. And having it on osisText is silly (see below).
Default: true
<xs:attribute name="canonical" type="xs:boolean" use="optional" default="true"/>
osisText
verse

Default: false
<xs:attribute name="canonical" type="xs:boolean" use="optional" default="false"/>
header
div
note
reference
title
titlePage
> The value of this attribute is "inherited," that is once it is set, any 
> subelement of that element inherits the same setting. 
> 

Default: inherited
<xs:attribute name="canonical" type="xs:boolean" use="optional"/>
The rest of the elements.

The examples on the same page are confusing, as they don't fit with the XML 
inheritance mechanism. They have an explicit value on a parent element forcing 
the inclusion of the attribute on an element with that as a default. Having a 
default value means that that element never inherits the value.

With inheritance, it should be possible, at any point in the document, to ask 
an XML parser what the value of canonical is.
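A minimal sketch of what such a lookup could look like, written in Python with 
the standard xml.etree module. It is not code from SWORD, JSword or osis2mod. 
The function name effective_canonical, the DEFAULTS table (copied from the 
lists above) and the namespace-free sample fragment are all assumptions made 
for illustration.

```python
import xml.etree.ElementTree as ET

# Schema defaults as listed above; all other elements have no default
# and therefore fall through to their parent (assumed inheritance rule).
DEFAULTS = {
    "osisText": True, "verse": True,
    "header": False, "div": False, "note": False,
    "reference": False, "title": False, "titlePage": False,
}

def effective_canonical(elem, parents):
    """Walk from elem up through its ancestors and return the first
    explicit or schema-default canonical value found, else None."""
    node = elem
    while node is not None:
        explicit = node.get("canonical")
        if explicit is not None:
            return explicit in ("true", "1")  # xs:boolean lexical forms
        if node.tag in DEFAULTS:
            return DEFAULTS[node.tag]
        node = parents.get(node)
    return None

# Hypothetical fragment; a real OSIS document also carries a namespace.
doc = ET.fromstring(
    '<osisText><div type="book"><chapter osisID="Gen.1">'
    '<verse osisID="Gen.1.1">In the beginning<note>a note</note></verse>'
    '</chapter></div></osisText>'
)
# ElementTree has no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in doc.iter() for child in parent}

for tag in ("chapter", "verse", "note"):
    print(tag, effective_canonical(doc.find(f".//{tag}"), parents))
# chapter False  (no default of its own, falls through to <div>)
# verse True
# note False
```

Under these assumptions the chapter comes out as false, because it has no 
default of its own and picks up the default of the enclosing div.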

However, the attribute "canonical" is not actually inheritable, according to:
http://www.w3.org/TR/2009/WD-xmlschema11-1-20090130/#Inherited_attributes
> 3.3.5.6 Inherited Attributes
> 
> Schema Information Set Contribution: Inherited Attributes
> [Definition:]  An attribute information item A, whether explicitly specified 
> in the input information set or defaulted as described in Attribute Default 
> Value (§3.4.5.1), is potentially inherited by an element information item E 
> if and only if all of the following are true:
> 1 A is among the [attributes] of one of E's ancestors.
> 2 A and E have the same [validation context].
> 3 One of the following is true:
> 3.1 A is ·attributed to· an Attribute Use whose {inheritable} = true.
> 3.2 A is not ·attributed to· any Attribute Use but A has a ·governing 
> attribute declaration· whose {inheritable} = true.
> If and only if an element information item P is not ·skipped· (that is, it is 
> either ·strictly· or ·laxly· assessed), in the ·post-schema-validation 
> infoset· each of P's element information item [children] E which is not 
> ·attributed to· a skip Wildcard, has a property:
> PSVI Contributions for element information items
> [inherited attributes]
> A list of attribute information items. An attribute information item A is 
> included if and only if all of the following are true:
> 1 A is ·potentially inherited· by E.
> 2 Let O be A's [owner element]. A does not have the same expanded name as 
> another attribute which is also ·potentially inherited· by E and whose [owner 
> element] is a descendant of O.
> 
I presume this is a bug in the OSIS Schema.

From a practical perspective in encoding a whole document, there are two 
scenarios to consider:
1) Milestoning structural elements. (BCV: Book, Chapter and Verse encoding)
2) Milestoning verses. (BSP: Book, Section and Paragraph encoding, recommended)

First the text of the work has to be within (using my notation)
<osis><osisCorpus>(<osisText>(<header>...</header>)*(<titlePage>...</titlePage>)?(<div>CONTENT</div>)+</osisText>)+</osisCorpus></osis>
or
<osis>(<osisText><header>...</header>(<titlePage>...</titlePage>)?(<div>CONTENT</div>)+</osisText>)+</osis>
(Note: osis2mod expects only one osisText)

The significant part is the <div>: it cannot be in milestoned form and still 
pass validation. The default value of canonical on this element is "false". 
Therefore, all descendants not contained in elements whose default is "true" or 
that explicitly declare canonical="true" inherit the value "false".

Because divs can be nested, each div resets the state of canonical, either to 
its default of false or to the declared canonical value.

The fact that <osisText> defaults canonical to true is meaningless. All of its 
children have a default of false. So practically speaking, the only element 
with canonical="true" is a verse, along with those of its contents that don't 
have a default of false themselves.
The other implication of using the non-milestoned form of <div> is that, by 
OSIS semantics, all other <div>s have to be container elements too, not 
milestoned. (I can quote the OSIS 2.1.1 manual, if needed). Personally, I think 
this is too broad a semantic for <div> and should take into consideration the 
type attribute.

In case 1), where the document uses the container form for Books (<div 
type="book">), <chapter> and <verse>, and uses the milestoned form of other 
containers as needed or semantically required, the intention of the OSIS manual 
is preserved. The defaults work as intended.

However, in case 2), where the verse is milestoned, the text and other elements 
of the verse are not children of the verse element but of the container that 
they are in, typically a paragraph or a div. By the rules of XML (if 
inheritance were properly specified), the parent container would need to 
explicitly give or inherit canonical="true".
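Case 2) can be illustrated with a small Python sketch (standard library only). 
This is an illustration under the inheritance rule described above, not how 
SWORD or JSword actually decide. The helper names inherited_canonical and 
verse_text_is_canonical and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

# Schema defaults as listed earlier in this message.
DEFAULT_TRUE = {"osisText", "verse"}
DEFAULT_FALSE = {"header", "div", "note", "reference", "title", "titlePage"}

def inherited_canonical(path):
    """path lists the elements from the root down to the container that
    holds the text; the nearest explicit or default value wins."""
    for node in reversed(path):
        value = node.get("canonical")
        if value is not None:
            return value in ("true", "1")  # xs:boolean lexical forms
        if node.tag in DEFAULT_TRUE:
            return True
        if node.tag in DEFAULT_FALSE:
            return False
    return None

def verse_text_is_canonical(div_attrs):
    """Build a BSP-style fragment with a milestoned verse and report the
    canonical value that applies to the verse text."""
    doc = ET.fromstring(
        f'<osisText><div type="book"{div_attrs}><p>'
        '<verse sID="Gen.1.1" osisID="Gen.1.1"/>In the beginning'
        '<verse eID="Gen.1.1"/></p></div></osisText>'
    )
    div = doc.find("div")
    p = div.find("p")
    # The verse text follows the start milestone, so its parent is <p>,
    # not <verse>; the verse's default of true never applies to it.
    assert p.find("verse").tail == "In the beginning"
    return inherited_canonical([doc, div, p])

print(verse_text_is_canonical(""))                   # False
print(verse_text_is_canonical(' canonical="true"'))  # True
```

With the plain div the verse text resolves to false; only the explicit 
canonical="true" on the div makes it true.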

With regard to SWORD and JSword, they always work on a fragment of the whole 
document and might not have the parent from which to determine whether 
canonical is true or false. Practically, they assume true.

If the OSIS schema had the default of canonical on <div> to be true or if it 
were optional (making the default on osisText meaningful), there would be no 
issue.

This is to say, I think the OSIS Schema has it wrong for a <div>. Until or 
unless it is changed, one nearly always has to have canonical="true" on a div.

In Him,
        DM

On Feb 29, 2012, at 2:46 PM, Troy A. Griffitts wrote:

> Sorry to only jump in on problems, but...
> 
> I don't believe the preceding explanation of 'canonical' is correct.
> 
> OSIS defaults many attributes to canonical, including <verse> and <chapter>
> 
> I believe we defined canonical as text belonging to the base work.
> 
> For us, this is mostly Bibles.
> 
> For a study Bible, it would exclude all commentary and notes, and only 
> include Biblical text.
> 
> Basically, canonical for the Open Scripture Information Standard refers to 
> Biblical text, and you'd be hardpressed to use it for anything else 
> practically, though I could see a purist trying to make an argument for it.
> 
> For example, Josephus would only include the text of Josephus.
> 
> And while technically true, the practical uses for 'canonical' are things 
> like:
> 
> Showing Psalm titles even when the user has asked not to show 'titles'
> Searching typically is only over 'canonical' text
> 
> -- but we usually work the opposite way: we take out notes, xrefs, headings, 
> and index what is left, so the Josephus example isn't practically a problem 
> for us right now (plus I think our Josephus module only contains Josephus 
> text).  And this is simply for indexed searching.  Our full text searching 
> allows you to search any of these other fields: notes, xrefs, headings, 
> just about anything in an entry attribute.  We have talked about providing 
> indexed searching for some of these things, but really? how often do you 
> search the notes?  Just wait the 4 seconds to do the unindexed search.  But 
> we have lots of future ideas of how to modularize the search framework so a 
> frontend could supply a filter which outputs what to include in a named 
> lucene index. Anyway, tangent...
> 
> 
> Summary,
> <verse> already indicates canonical material by default
> Psalm titles, being canonical and usually not within a verse (unless it's a 
> v11n which includes them in a verse), need to be marked specifically as 
> canonical.
> 
> If the OSIS docs say different, let me know and I'll poke the editor.
> 
> Troy
> 
> 
> 
> On 02/29/2012 07:11 PM, David Haslam wrote:
>> Thanks DM,
>> 
>> Someone like to volunteer to enhance usfm2osis.pl to ensure that
>> canonical="true" is set as it should be?
>> 
>> David
>> 
>> --
>> View this message in context: 
>> http://sword-dev.350566.n4.nabble.com/Setting-canonical-true-tp4432196p4432418.html
>> Sent from the SWORD Dev mailing list archive at Nabble.com.
>> 
>> _______________________________________________
>> sword-devel mailing list: sword-devel@crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page