Hi Greg, thanks for the ideas. Have a look at this thread:

http://www.crosswire.org/pipermail/sword-devel/2005-April/022109.html

But please, let's keep this thread focused on solving the problem that is holding up this release :)

Any suggestions for WHAT the output should be?

I wasn't even thinking of including the <div> tags, which are there simply to mark versification divisions.

Troy


On 05/08/2013 01:15 PM, Greg Hellings wrote:
Off the cuff, the issue seems to be the difference in the semantics of <div>: in OSIS it marks a structural division within a text, which can occur at many different levels and layers, whereas in XHTML it represents a block-level box that by default takes the full width of its container.

Producing "proper" output seems feasible only if we are handling a whole block of output. The sample you posted contains four sID attributes and one eID attribute on div elements. These are self-closing elements, which will typically render as vertical whitespace in XHTML. Ideally, any div with sID="..." would be rendered as <div> and any with eID="..." as </div>.
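
As a rough illustration of that sID/eID mapping, here is a minimal Python sketch. It is not SWORD filter code: the function name milestones_to_containers and the regex-based approach are assumptions made for this example, and it only handles milestone <div> elements in a block of text that is already complete.

```python
import re

# Any self-closing <div .../> element; the attribute text is captured.
_SELF_CLOSING_DIV = re.compile(r'<div\b([^<>]*?)/>')
# An sID or eID attribute (the \b keeps "osisID" from matching "sID").
_MILESTONE_ATTR = re.compile(r'\b(sID|eID)="[^"]*"')


def milestones_to_containers(osis: str) -> str:
    """Hypothetical helper: turn OSIS milestone divs into XHTML containers.

    <div sID="..."/> becomes <div>, <div eID="..."/> becomes </div>.
    Self-closing divs without sID/eID are passed through unchanged.
    """
    def swap(match: re.Match) -> str:
        found = _MILESTONE_ATTR.search(match.group(1))
        if found is None:
            return match.group(0)  # not a milestone, leave it alone
        return "<div>" if found.group(1) == "sID" else "</div>"

    return _SELF_CLOSING_DIV.sub(swap, osis)


if __name__ == "__main__":
    sample = ('<div sID="gen3" type="section"/> <h3>Introduction</h3> '
              'text <br /><div eID="gen3" type="section"/>')
    print(milestones_to_containers(sample))
```

This only produces balanced output when the start and end milestones are both present in the text handed to it, which is exactly the limitation discussed next.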

The problem comes when rendering a list of 30 (or however many) verses, if each is rendered separately by our filters. If <div sID="gen1"/> is within Gen.0.0 but <div eID="gen1"/> is at the end of the chapter, which appears to be the case here, then we don't really want to generate something like
<div>
 Gen.0.0
</div>
 Gen.1.0
 Gen.1.1
 Gen.1.2
 ...
But rather we want something like
<div>
 Gen.0.0
 Gen.1.0
 Gen.1.1
 Gen.1.2
 ...
</div>
At least, we do when not dealing with interlinear versions.

In BibleTime we discussed how to handle this properly and came up with an interesting solution that we designed but never implemented. Our thought was to store, along with each verse, pre-verse and post-verse markup. This would need to become part of the OSIS import process, which would track the "semantically" open elements such as <div sID="gen1" />: by XML standards that element is already closed, but OSIS semantics treat the div as open until <div eID="gen1" /> is encountered. This tracking would be in addition to the XML elements that are actually open.

Every verse entry would then store the elements open at its start and those still open at its end. Then, when an arbitrary range is selected for rendering (say, Genesis 1:15-25), a single, complete OSIS document could be generated by taking Gen.1.15.pre, appending the text of Gen.1.15-Gen.1.25, and then appending Gen.1.25.post. A proper filter can then operate on the entire block of text to generate correctly wrapping <div> ... </div> and other markup.
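
To make the pre/post idea concrete, here is a small Python sketch. As said, nothing like this was ever implemented, so every name in it (Entry, annotate, render_range) is hypothetical. It tracks only milestone <div> elements, not the XML elements that are actually open, and it assumes the entries are visited in canonical order during import.

```python
import re
from dataclasses import dataclass, field

# A milestone div: self-closing, carrying either sID or eID.
_MILESTONE = re.compile(r'<div\b[^<>]*?\b(sID|eID)="([^"]*)"[^<>]*?/>')


@dataclass
class Entry:
    key: str    # e.g. "Gen.1.15"
    text: str   # raw OSIS text stored for this entry
    pre: list = field(default_factory=list)   # start milestones open before the entry
    post: list = field(default_factory=list)  # end milestones still owed after it


def annotate(entries):
    """Import-time pass: record the semantically open divs around each entry."""
    open_divs = []  # (sID value, original start tag), outermost first
    for entry in entries:
        entry.pre = [tag for _, tag in open_divs]
        for m in _MILESTONE.finditer(entry.text):
            kind, ident = m.group(1), m.group(2)
            if kind == "sID":
                open_divs.append((ident, m.group(0)))
            else:
                open_divs = [(i, t) for i, t in open_divs if i != ident]
        # Close innermost first so the synthesized end tags nest correctly.
        entry.post = [f'<div eID="{i}"/>' for i, _ in reversed(open_divs)]
    return entries


def render_range(entries, first, last):
    """Build one balanced OSIS fragment for entries[first..last] inclusive."""
    chunk = entries[first:last + 1]
    body = "".join(e.text for e in chunk)
    return "".join(chunk[0].pre) + body + "".join(chunk[-1].post)


if __name__ == "__main__":
    verses = annotate([
        Entry("Gen.0.0", '<div sID="gen1"/>Intro '),
        Entry("Gen.1.1", "Verse 1 "),
        Entry("Gen.1.2", "Verse 2 "),
        Entry("Gen.1.3", 'Verse 3<div eID="gen1"/>'),
    ])
    print(render_range(verses, 1, 2))
```

The fragment returned for a range always has matching start and end milestones, so a block-level filter could then turn them into wrapping <div> ... </div> pairs.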

Perhaps I have gone beyond the question of what the output for the above markup _should_ be, but I wanted to toss out the solution that the BT folks put some brain power into for the problem of stray opening and closing <div> elements. These seem to be the main problem in the sample you presented. Again, this was never implemented, as it would essentially require re-importing Sword module data to generate the pre- and post- data, and that went beyond the scope of any work done on BibleTime so far.

--Greg

On Wed, May 8, 2013 at 2:31 PM, Troy A. Griffitts <scr...@crosswire.org> wrote:

    OK guys,

    I'm starting work on this. I've set up a test in our testsuite for
    whitespace against our OSIS reference doc. Here are the links:

    test:
    http://crosswire.org/svn/sword/trunk/tests/osistest.cpp
    (whitespace test added at the end)

    OSIS Reference Document:
    http://crosswire.org/svn/sword/trunk/tests/testsuite/osisReference.xml

    Before I start any work, I want to show what output we get
    currently. It is obviously seriously messed up.

    This is from the new XHTML filter set (which is based on the
    HTMLHREF filter set). The first obvious issue is the passthru of
    the OSIS <div> elements as-is. Anyone like to suggest exactly WHAT
    they would like as output from the XHTML filterset from the OSIS
    Reference document here? Current output below:


    <div sID="gen1" type="bookGroup"/> <h3>Old Testament</h3> <div
    osisID="Gen" sID="gen2" type="book"/> <h3>THE FIRST BOOK OF MOSES
    CALLED GENESIS</h3> <div sID="gen3" type="section"/>
    <h3>Introduction and Outline</h3> <br /> This is the <b>Book of
    Genesis</b>, the <i>first</i> book in the Bible. It may be
    outlined as follows: <br /><br /> <ul> <li><i>1</i>Creation of
    Heaven and Earth, 1:1-2:4a</li> <li><i>2</i>Creation of Man and
    Woman, 2:4b-25</li> <li><i>3</i>Fall, 3:1-24</li> <li>...</li>
    </ul> <br /><br /> Tables work like this: <b>Column 1 Label</b>
    <b>Column 2 Label</b> Column 1, Row 1 Column 2, Row 1 Column 1,
    Row 2 Column 2, Row 2 <br /><div eID="gen3" type="section"/>
    <div sID="gen7" type="majorSection"/> <h3>From Creation to Abraham
    (1:1--11:9)</h3>
    [ Genesis 1:1 ] In the beginning God created the heaven and the
    earth. <br />
    [ Genesis 1:2 ] Text of verse 2.




    _______________________________________________
    sword-devel mailing list: sword-devel@crosswire.org
    http://www.crosswire.org/mailman/listinfo/sword-devel
    Instructions to unsubscribe/change your settings at above page




