On 09/12/2015 12:44 PM, Leif Halvard Silli (russisk.no) wrote:

Jumping to assumptions about what I meant?

Sorry but answering "XXE cannot load or save HTML, only XHTML" was too tempting.



It could, of course, be that my expectations need to be adjusted a
little, as well ... However, when I said «use /XXE/ to /produce/ HTML» I
obviously did not mean /load/ HTML, and when I said 'HTML', I of course
referred to "HTML" in "XXE lingo" - the XML-compatible, hopefully
HTML5-valid HTML one gets when one, via XXE’s "File|New" menu, picks
'HTML' in the 'HTML5' section of the dialog window that pops up.

Crystal clear.



XXE should of course not output - and need not support the parsing of -
tag omissions. But with regard to loading and saving data-* attributes:
as far as I can tell, well-formed data-* attributes are parsed and
saved by XXE without trouble.

No, not at all. We don't know how to do that, as data-* prefixed attributes cannot be expressed in any standard schema language.




However, XXE’s validation service has a few anomalies:

 1. It flags data-* attributes as invalid, which per the HTML5
    spec they are not - the choice of profile, 'HTML' or 'XHTML', does not
    matter in this regard.
 2. For 'HTML'-profile XHTML5 documents, it fails to inform the user that
    "HTML" documents containing processing instructions are not
    HTML5-conforming.
 3. The WAI-ARIA accessibility attributes are another group of
    HTML5-valid attributes which XXE’s validation service flags as
    invalid: http://www.w3.org/TR/html5/dom.html#wai-aria

For comparison, the NU validator at http://validator.nu and at
http://validator.w3.org/nu offers several validation profiles - broadly
speaking, it offers one profile for HTML and another for XHTML - but
with several sub- and super-options. I believe the two NU validators
even differ slightly with regard to their HTML profiles. But both
validators adjust themselves, partly automatically, depending on the
file content. That they inform me about - or let me choose - the profile
in use is essential: many validation profiles are possible, both
extended profiles and subset profiles, as long as the validator lets
the user know which profile is in use.

OK.




Suggestions for the XXE validation service:

 1. WAI-ARIA attributes (namely, the role="*" attribute plus the
    various attributes prefixed with the "aria-" string) are meant for
    any HTML or XML document for human consumption. Thus, you could
    perhaps offer a WAI-ARIA on/off button (inside Preferences -> Tools ->
    Validat(ion)) to enable such attributes anywhere - in SVG, MathML,
    DocBook etc. (But note that, per the HTML5 specification, the exact
    places where those attributes are permitted are, for various reasons,
    somewhat convoluted. Hence, an on/off option without further
    refinement, while both progress and acceptable, would, I think,
    not completely match HTML5.)
 2. There could be an on/off button to allow data-* attributes in any
    flavour of HTML/XHTML. This makes sense from the perspective of the
    latest iteration of HTML (namely the HTML5 spec) since, for example,
    a document with the XHTML 1.0 Strict doctype can be validated as an
    HTML5 document provided that the document otherwise conforms to
    HTML5 - the NU validators support this.
 3. When working with the 'HTML' profile of XHTML documents:
     1. There could be an on/off button in the preferences to show a
        warning when processing instructions are in use. Since the
        latest iteration of HTML, namely HTML5, does not allow PIs,
        this limitation can be applied to XHTML1 meant for text/html
        consumption as well - thus the limitation could apply to any
        HTML profile document.
     2. Validation could, perhaps via the above on/off button, display an
        'orange' warning message whenever the document contains
        processing instructions (i.e. an extension of HTML5).
     3. Perhaps via an on/off button, the validator could disallow PIs in
        'HTML' profile documents.


We would like to implement what you describe but we don't know how to do it.

As explained above, the schema languages (W3C XML Schema, RELAX NG) cannot express: "allow any attribute prefixed by strings such as 'data-' or 'aria-'".
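To make the limitation concrete: the rule itself is trivial to state procedurally, it just cannot be written as a name class in a grammar. The following is only a sketch in Python of such a prefix test, assuming it would run as a separate pass next to schema validation; it is not how XXE works, and the names ALLOWED_PREFIXES, is_extension_attribute and extension_attributes are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of prefixes that a separate pass would tolerate.
ALLOWED_PREFIXES = ("data-", "aria-")


def is_extension_attribute(name: str) -> bool:
    """True for an unqualified attribute name such as 'data-id' or 'aria-hidden'."""
    if name.startswith("{"):
        # ElementTree writes namespaced attributes as '{uri}local'; skip those.
        return False
    # Require at least one character after the prefix ('data-' alone does not count).
    return any(name.startswith(p) and len(name) > len(p) for p in ALLOWED_PREFIXES)


def extension_attributes(xml_text: str) -> list:
    """Return (element tag, attribute name) pairs matching the prefix rule."""
    root = ET.fromstring(xml_text)
    hits = []
    for element in root.iter():
        for attr_name in element.attrib:
            if is_extension_attribute(attr_name):
                hits.append((element.tag, attr_name))
    return hits
```

A validator front end could, under that assumption, use such a list to suppress "unknown attribute" errors for the matching names while leaving every other schema error untouched.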

OTOH, "3.2. 'orange' warning message whenever the HTML5 document contains processing instructions" is an easy one. A very simple Schematron (https://en.wikipedia.org/wiki/Schematron) could be used to implement this.
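In Schematron the check would boil down to a rule reporting any node matched by the XPath node test processing-instruction(). The same check can be sketched procedurally in Python with the standard library; this is only an illustration of what the warning would look for, not XXE code, and the function name find_processing_instructions is an assumption.

```python
from xml.dom import minidom, Node


def find_processing_instructions(xml_text):
    """Return the (target, data) pairs of every processing instruction found.

    The XML declaration (<?xml version="1.0"?>) is not a processing
    instruction and is not reported by minidom.
    """
    doc = minidom.parseString(xml_text)
    found = []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
                found.append((child.target, child.data))
            walk(child)

    walk(doc)
    return found


def pi_warnings(xml_text):
    """Turn each processing instruction into a warning message."""
    return [
        "warning: processing instruction <?%s?> is not allowed in HTML5" % target
        for target, _ in find_processing_instructions(xml_text)
    ]
```

An empty list from pi_warnings would mean the document passes this particular check.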





        How about adding a "sanitize for Web" function? What I mean is a
        saving/conversion feature which would save a copy of the current
        XHTML document as an HTML document and which would, during the
        conversion, also
        1. strip non-valid features (such as PIs),
        2. convert XML features to HTML features (e.g. change xml:lang="nn"
        to lang="nn" and change <?xml version="1.0" encoding="UTF-8"?> to
        <meta charset="UTF-8"/>)
        3. save with the correct file extension
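The three steps above can be sketched in Python with the standard library. This is only a rough illustration of the idea, not an XXE feature: the names sanitize_for_web and html_filename are hypothetical, UTF-8 output is assumed, and doctype handling is left out.

```python
from pathlib import Path
from xml.dom import minidom, Node


def sanitize_for_web(xhtml_text: str) -> str:
    """Return the document element serialized without PIs or xml:lang."""
    doc = minidom.parseString(xhtml_text)

    def clean(node):
        # Copy the child list because nodes are removed while iterating.
        for child in list(node.childNodes):
            if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
                node.removeChild(child)  # step 1: strip PIs
                continue
            if child.nodeType == Node.ELEMENT_NODE:
                if child.hasAttribute("xml:lang"):
                    # step 2: xml:lang="nn" becomes lang="nn"
                    if not child.hasAttribute("lang"):
                        child.setAttribute("lang", child.getAttribute("xml:lang"))
                    child.removeAttribute("xml:lang")
                clean(child)

    clean(doc)

    # step 2 (continued): the XML declaration is replaced by <meta charset>.
    heads = doc.getElementsByTagName("head")
    if heads:
        metas = heads[0].getElementsByTagName("meta")
        if not any(m.hasAttribute("charset") for m in metas):
            meta = doc.createElement("meta")
            meta.setAttribute("charset", "UTF-8")
            heads[0].insertBefore(meta, heads[0].firstChild)

    # Serializing only the root element leaves out the XML declaration
    # (and, in this sketch, any doctype).
    return doc.documentElement.toxml()


def html_filename(name: str) -> str:
    """Step 3: same file name, but with an .html extension."""
    return str(Path(name).with_suffix(".html"))
```

The output is still XML-compatible markup; whether empty elements and other XML-isms should also be rewritten for text/html consumption is a separate question this sketch does not address.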

    If you use "File|New" and select the document template called
    "XHTML|5.0|HTML Page", you'll already have 2) and 3).

    1) could indeed be implemented by "XHTML|Preview".

Implementing 1) via XHTML|Preview sounds like a good idea!


Will do.

--
XMLmind XML Editor Support List
xmleditor-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/xmleditor-support
