Hi Troy/Greg,

On Mon, Sep 17, 2012 at 11:49 PM, Troy A. Griffitts <scr...@crosswire.org> wrote:

> 3 brief points.
>
> The HTML filter set is old and no one I know of uses this filter set.
> HTMLHREF, WEBIF, and XHTML are the 3 filter sets I know which are in use
> today. I've started to switch SWORDWeb from WEBIF to the XHTML filter set.
> Once this is done, I wouldn't mind deprecating both the WEBIF and HTML
> filter sets. Eventually, I'd like to deprecate the HTMLHREF filter set,
> leaving only one (XHTML) filter set we all use in common, but I know
> xiphos, and others are still using this as the primary HTML output filter
> set.

I know BPBible uses HTMLHREF, though we derive from it and make many
changes to the output accordingly.

> You should be seeing <!P><br /> output from this <div type="paragraph">
> construct, not simply <!P>. Again, let's remove the <!P> if xiphos no
> longer needs it. <br /> is certainly valid, even if not necessarily the
> most desirable XHTML output for a paragraph division.

BPBible has in several places of the code calls to:

    data.replace("<!P>", "</p><p>")

That is all the processing we seem to do on it.

Jon

> On 09/16/2012 01:54 AM, Greg Hellings wrote:
>
>> On Sat, Sep 15, 2012 at 5:11 PM, Troy A. Griffitts <scr...@crosswire.org>
>> wrote:
>>
>>> Greg,
>>>
>>> Thank you for posting the issue. I'm still really having a tough time
>>> understanding the problem. I know we've been crossing on IRC, so I'm not
>>> sure if you are seeing any of my responses to you there.
>>
>> Anything you say while my Nick is in the channel is saved by ZNC and
>> bounced to me the next time I login, up until I manually clear the
>> logs. So yes, I've been getting the messages you've sent.
>>
>>> We have code to handle these divs and not pass them through, as shown here:
>>>
>>> http://crosswire.org/svn/sword/trunk/src/modules/filters/osisxhtml.cpp
>>>
>>> search for "paragraph" and it should be like the 2nd or 3rd hit, but there
>>> is a comment which specifically shows your construct of
>>> <div eID="" type="paragraph" />
>>>
>>> The end result is that this gets output as <!P><br />
>>>
>>> If you look below in your ./lookup output, you will see this exact
>>> output.
>>
>> That output is the result of FMT_WEBIF rendering. I'm not sure exactly
>> what that is, so I can't speak to that.
>>
>> When I rebuild with HTMLHREF and XHTML I get <!/P>. This is fine
>> for HTMLHREF according to what Chris has said elsewhere and you state
>> below as that is intended for use by GS/Xiphos. That does not make for
>> acceptable XHTML - it is not valid.
>>
>> When I rebuild lookup with FMT_HTML I am still seeing the div tag
>> passed through untouched. That is not valid HTML as discussed earlier
>> in this thread unless we're hoping to target a very strongly
>> discouraged construct of an older version of HTML.
>>
>> Strangely, I can't get the output of Diatheke and lookup to sync up on
>> the XHTML results.
>>
>>> The <!P> was added for/by gnomesword years ago and can be taken out if you
>>> do a grep through the xiphos code and find it not needed any longer. I'm
>>> not sure why it was added.
>>>
>>> But, the end result is that we do process this construct and should never
>>> pass it through. If Bibletime gets it passed through, then they are not
>>> using our filters, either because they are using their own distinct
>>> filter set, or their filter set overrides this processing and doesn't
>>> accept our default processing.
>>
>> The issue in BibleTime has already been taken care of.
>> This only came
>> to light because the offending <div> tags were in the preverse
>> material which BibleTime does not pass through any filters but instead
>> simply strips tags out of the raw text. I can't pretend to know why
>> that is a good idea, but I'm not interested in that - only in getting
>> my module looking correct.
>>
>> I figured I'd point out the discrepancies between SWORD's usages and
>> the specs in the meantime. To that point, XHTML and HTML are still
>> generating invalid output according to lookup.
>>
>> --Greg
>>
>>> If you point me to an svn or git or whatever link to the Bibletime Render
>>> Filter which processes OSIS, I'd be happy to have a look.
>>>
>>> Troy
>>>
>>>
>>> On 09/15/2012 06:56 PM, Greg Hellings wrote:
>>>
>>>> To emphasize that we have an issue here, in the SWORD filters, here is
>>>> the output from diatheke with HTML, HTMLHREF and XHTML (which support
>>>> I just hacked in now in order to test).
>>>>
>>>> greg@Gateway08:~/Source/sword/build (master)$ !diath
>>>> diatheke -b TKE -o h -f HTMLHREF -k Gen 1:2
>>>> Genesis 1:2: Elaboya kayawomele naari kayanna dhego. Yaali mahinje
>>>> ooddiiha ni owoopiha yahuruwedhiwe ni yiihi. Muneba wa Mulugu
>>>> waviravira vadhulu va mahinje, osasanyedhelaga. <!/P><br />
>>>> (TKE)
>>>> greg@Gateway08:~/Source/sword/build (master)$ diatheke -b TKE -o h -f HTML -k Gen 1:2
>>>> <meta http-equiv="content-type" content="text/html; charset=UTF-8">Genesis 1:2: Elaboya kayawomele naari kayanna dhego.
>>>> Yaali mahinje ooddiiha ni owoopiha yahuruwedhiwe ni yiihi. Muneba wa
>>>> Mulugu waviravira vadhulu va mahinje, osasanyedhelaga.
>>>> <div eID="gen11" type="paragraph"/><br />
>>>> (TKE)
>>>> greg@Gateway08:~/Source/sword/build (master)$ diatheke -b TKE -o h -f XHTML -k Gen 1:2
>>>> Genesis 1:2: Elaboya kayawomele naari kayanna dhego. Yaali mahinje
>>>> ooddiiha ni owoopiha yahuruwedhiwe ni yiihi. Muneba wa Mulugu
>>>> waviravira vadhulu va mahinje, osasanyedhelaga.
>>>> <div eID="gen11" type="paragraph"/>
>>>> (TKE)
>>>>
>>>> All three are outputting the same verse from the same module. HTML and
>>>> XHTML are outputting <div eID="gen11" type="paragraph"/> which is what
>>>> the module has in its rawest form. HTMLHREF outputs <!/P> which is not
>>>> valid anything. There are other, odd, differences between the three
>>>> but none of those are germane to this discussion, it would seem to me.
>>>>
>>>> $ ./examples/cmdline/lookup TKE Gen.1.2
>>>> ==Raw=Entry===============
>>>> Genesis 1:2:
>>>> Elaboya kayawomele naari kayanna dhego. Yaali mahinje ooddiiha ni
>>>> owoopiha yahuruwedhiwe ni yiihi. Muneba wa Mulugu<note n="1">1.2*
>>>> <catchWord>Muneba wa Mulugu</catchWord> naari wi «pevo yuulubale.»
>>>> Mulugu ohukalana muneba mmohi oneethanihu «Muneba Woweela.» Muneba
>>>> Woweela ohukamihedha voopaddusiwa elabo. Mwaana a Mulugu, Yesu
>>>> Kirisitu, teto ohukamihedha moopaddusa (Zhuwawu 1.1-3; aKolose 1.16;
>>>> aHeberi 1.2.)</note> waviravira vadhulu va mahinje, osasanyedhelaga.
>>>> <div eID="gen11" type="paragraph"/>
>>>> ==Render=Entry============
>>>> .divineName { font-variant: small-caps; } .wordsOfJesus {color: red; }
>>>> Elaboya kayawomele naari kayanna dhego. Yaali mahinje ooddiiha ni
>>>> owoopiha yahuruwedhiwe ni yiihi. Muneba wa Mulugu waviravira vadhulu
>>>> va mahinje, osasanyedhelaga. <!/P><br />
>>>> ==========================
>>>> Entry Attributes:
>>>>
>>>> [ Footnote ]
>>>> [ 1 ]
>>>> body = 1.2* <catchWord>Muneba wa Mulugu</catchWord> naari wi «pevo
>>>> yuulubale.» Mulugu ohukalana muneba mmohi oneethanihu «Muneba
>>>> Woweela.» Muneba Woweela ohukamihedha voopaddusiwa elabo. Mwaana a
>>>> Mulugu, Yesu Kirisitu, teto ohukamihedha moopaddusa (Zhuwawu 1.1-3;
>>>> aKolose 1.16; aHeberi 1.2.)
>>>> n = 1
>>>>
>>>> On Fri, Sep 14, 2012 at 7:15 PM, Chris Little <chris...@crosswire.org>
>>>> wrote:
>>>>
>>>>> On 09/14/2012 01:02 PM, Greg Hellings wrote:
>>>>>
>>>>>> So I've been debugging a module display problem in BibleTime. I
>>>>>> mentioned it on IRC with Troy the other day but we weren't able to
>>>>>> connect at the same time to discuss further. The issue has to do with
>>>>>> paragraph tags - in osis2mod these tags are being converted from <p>
>>>>>> to <div sID="someid" type="paragraph" />.
>>>>>
>>>>> This is extraordinarily bad. This is a change in semantics, because <p>
>>>>> and <div type="paragraph"> are not semantically equivalent.
>>>>>
>>>>> <p> marks the type of paragraph we all probably think of first:
>>>>> generally, a chunk of text with newlines before and after.
>>>>>
>>>>> <div type="paragraph"> marks a formal division within a text that
>>>>> happens to be identified as a 'paragraph' and may consist of multiple
>>>>> <p>-type paragraphs. Examples of these divisions are found in many laws
>>>>> and the Catechism of the Catholic Church (which does exist in OSIS
>>>>> form). Here's part 1, section 1, chapter 1, article 1, paragraph 1 of
>>>>> the CCC: http://www.vatican.va/archive/ENG0015/__P16.HTM
>>>>> As you can see, it consists of many <p>-type paragraphs but is a single
>>>>> <div type="paragraph">-type paragraph.
>>>>>
>>>>> Abhorrent though I consider milestoned <p/>, I think I would much
>>>>> prefer to see us map <p>...</p> to <p sID=""/>...<p eID=""/> than see
>>>>> us clobber the semantics of a defined <div> type.
>>>>>
>>>>>> Thus, osis2mod is in violation of the suggested XML best practice by
>>>>>> creating a non-EMPTY tag as self-closing but this is seemingly pretty
>>>>>> common in the OSIS world.
>>>>>> Furthermore our filters are producing
>>>>>> invalid (or very strongly discouraged) HTML as per every still-in-use
>>>>>> version of the specs (HTML4, XHTML, HTML5). As such, I'm of the
>>>>>> opinion that this represents a bug in SWORD - at the very least in the
>>>>>> filters that permit empty, self-closing div tags to slip through what
>>>>>> are supposedly HTML outputs. Do others agree or disagree on this?
>>>>>
>>>>> I'm of the opinion that our OSIS is generally fine, meaning we should
>>>>> go ahead and keep allowing self-closing OSIS tags if possible (as input
>>>>> and output from osis2mod and as content of modules not produced by
>>>>> osis2mod). This is just a recommendation and specifically a
>>>>> recommendation for the purpose of aiding processing with legacy SGML
>>>>> tools, which I can't see us doing and don't personally care about. (The
>>>>> semantic violation noted above is a bug in my mind, but that issue is
>>>>> orthogonal.)
>>>>>
>>>>> I would agree that the filter output is buggy if we're generating
>>>>> disallowed tag forms. OSIS <div> and <p> would need to be translated to
>>>>> their correct, non-self-closing HTML forms. Beyond those two, I can't
>>>>> think of any tags that have the same form & general semantics in both
>>>>> OSIS & HTML.
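
To make the "translate to non-self-closing HTML forms" point concrete, here is a rough Python sketch. It is not SWORD's filter code and not how osisxhtml.cpp works; the function name is hypothetical, and the regular expression assumes the attribute order seen in the examples in this thread (sID/eID first, then type). It also maps the milestones to <p>...</p> purely to show the mechanics; whether <p> or <div> is the right HTML target is the semantic question discussed above.

```python
import re

# Matches a self-closing OSIS paragraph milestone such as
#   <div sID="gen11" type="paragraph"/>  or  <div eID="gen11" type="paragraph" />
# Assumption: sID/eID comes before type, as in the examples quoted in this thread.
_MILESTONE = re.compile(
    r'<div\s+(?P<kind>[se])ID="[^"]*"\s+type="paragraph"\s*/>'
)


def milestones_to_html(osis_text):
    """Replace paragraph milestones with non-self-closing HTML tags.

    A start milestone (sID) becomes <p>, an end milestone (eID) becomes </p>.
    Hypothetical helper for illustration only, not part of the SWORD API.
    """
    def _swap(match):
        return "<p>" if match.group("kind") == "s" else "</p>"

    return _MILESTONE.sub(_swap, osis_text)
```

Note that a single verse can contain only the end milestone (as Gen 1:2 in TKE does), in which case this emits a lone </p>. A real filter would have to track open paragraphs across verses rather than convert each entry in isolation.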
>>>>>
>>>>> --Chris
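
For reference, a small Python sketch of the marker replacement mentioned at the top of this mail, run against the HTMLHREF output Greg pasted. Only the replace() call itself comes from the thread; the function names and the list of "suspect" markers are made up for illustration, and nothing here is actual BPBible code.

```python
PARAGRAPH_MARKER = "<!P>"


def close_and_reopen_paragraphs(rendered):
    """Turn each <!P> marker into a paragraph boundary (the replace quoted above)."""
    return rendered.replace(PARAGRAPH_MARKER, "</p><p>")


def leftover_markers(rendered):
    """Return which of the marker strings seen in this thread are still present."""
    suspects = ("<!P>", "<!/P>")
    return [marker for marker in suspects if marker in rendered]


# Tail of the HTMLHREF output for Gen 1:2 as pasted earlier in the thread.
sample = "osasanyedhelaga. <!/P><br />"
converted = close_and_reopen_paragraphs(sample)
print(leftover_markers(converted))
```

The point of the check is that "<!P>" is not a substring of "<!/P>", so a plain replace on "<!P>" leaves the "<!/P>" form from Greg's output untouched. Whether a front end ever needs to handle that second form is not settled in this thread.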
_______________________________________________ sword-devel mailing list: sword-devel@crosswire.org http://www.crosswire.org/mailman/listinfo/sword-devel Instructions to unsubscribe/change your settings at above page