On 12/29/20 1:59 PM, Leif Halvard Silli wrote:
I assume that the following issue is well known to the XXE developers as well as to seasoned XXE users - steps to reproduce:

We certainly know about this behavior but do not consider it to be an issue per se.

For us, the only way to improve the situation would be to implement an "HTML mode" in XXE which, unlike any other XXE configuration, would not strictly conform to the underlying schema when it comes to indentation and un-indentation. Explanations below.

We currently don't plan to implement such an "HTML mode". Sorry.




 1. In XXE, create an HTML document with two consecutive non-empty <p>
    elements as children of the <body> element.
 2. Inspect the source code of the <body> element - it should look
    something like this:

    <p>Abc.</p><p>Def.</p>

    (The main thing being that XXE will automatically make sure there
    is zero whitespace between "</p>" and "<p>".)

 3. Now open or edit the document in another - XML-respecting - HTML
    editor. (For instance BlueGriffon.)

 4. Save the document. And then inspect the code - the line above will
    almost certainly look something like this:

    <p>Abc.</p>
    <p>Def.</p>

    (Whitespace has been added between "</p>" and "<p>".)

If <body> may contain text (in which case whitespace must NOT be added to indent paragraphs), then BlueGriffon is incorrect (at least XML-wise).



 5. Reopen the HTML document in XXE.

 6. XXE will now reformat the code - which is in principle totally OK
    (even if we know that many would prefer that it did not). However,
    even if XXE reformats the code, it *will not remove* the whitespace
    between "</p>" and "<p>". This is, in principle, also totally OK.
    However, because XXE has a very specific way to render whitespace
    "in between (certain) elements", this is still not really OK.

  * EXPECTED RESULT: Visually, the document should look the same.
  * ACTUAL RESULT: Visually, an empty line is drawn between the two <p>
    elements.

It's simply because, given the HTML schema you use, <body> may contain text, thus XXE considers whitespace to be "non-ignorable" inside <body>.

FYI, HTML5 <body> (https://html.spec.whatwg.org/multipage/sections.html#the-body-element) may contain text while XHTML 1.0 Strict and XHTML 1.1 <body> (https://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd) may not contain text.

Therefore the behavior you describe does not exist when editing XHTML 1.0 Strict and XHTML 1.1 documents.
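
To illustrate the rule, here is a minimal Python sketch. It is not XXE's actual code: the ALLOWS_TEXT table and the significant_whitespace() helper are hypothetical names invented for this example, and the two content models are reduced to a single flag. It only shows why the same whitespace between "</p>" and "<p>" must be kept under one schema and may be ignored under the other.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified content models (an assumption made for
# this sketch): True means <body> may contain text (mixed content).
ALLOWS_TEXT = {"html5": True, "xhtml1-strict": False}


def significant_whitespace(body_xml, schema):
    """Return the whitespace-only text runs found directly inside <body>
    that a schema-conforming editor would have to treat as content."""
    body = ET.fromstring(body_xml)
    if not ALLOWS_TEXT[schema]:
        # Element-only content: whitespace between children is ignorable.
        return []
    runs = []
    # Whitespace before the first child is stored in body.text.
    if body.text and not body.text.strip():
        runs.append(body.text)
    # Whitespace after each child is stored in that child's .tail.
    for child in body:
        if child.tail and not child.tail.strip():
            runs.append(child.tail)
    return runs


doc = "<body><p>Abc.</p>\n<p>Def.</p></body>"
print(significant_whitespace(doc, "html5"))
print(significant_whitespace(doc, "xhtml1-strict"))
```

With the HTML5-like model the newline is reported as content, which corresponds to the extra empty line seen in XXE; with the XHTML 1.0 Strict-like model nothing is reported.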




  * FEATURE REQUEST: Make sure that the document really does look the
    same. This can be done EITHER by performing *more* code
    reformatting, so that the "in between elements" whitespace is
    removed. Or by improving the XML rendering. I will assume that it
    would be easier to improve the reformatting of the code. In that
    regard, I would like to suggest an (improved) XML code minifying
    process - some kind of "code import process" which adapts the code
    to XXE's requirements.
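
    As a rough idea of what such an import step could look like, here
    is a small Python sketch. It is not an existing XXE feature; the
    BLOCK_TAGS set and the strip_interblock_whitespace() function are
    hypothetical names, and the sketch assumes element names without a
    namespace prefix. Note that, strictly speaking, it alters content
    wherever the schema allows text inside the parent element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, incomplete list of block-level elements (an assumption
# made for this sketch, not taken from any schema).
BLOCK_TAGS = {"p", "div", "ul", "ol", "pre", "table", "blockquote",
              "h1", "h2", "h3", "h4", "h5", "h6"}


def strip_interblock_whitespace(parent):
    """Remove, in place, whitespace-only text that sits between
    block-level children of `parent`. Whitespace next to inline
    elements or real text is left alone."""
    children = list(parent)
    # Whitespace before the first child lives in parent.text.
    if (children and children[0].tag in BLOCK_TAGS
            and parent.text and not parent.text.strip()):
        parent.text = None
    for i, child in enumerate(children):
        if not child.tail or child.tail.strip():
            continue  # no tail, or the tail holds real text
        nxt = children[i + 1] if i + 1 < len(children) else None
        # Only drop whitespace that is enclosed by block elements
        # (or by a block element and the end of the parent).
        if child.tag in BLOCK_TAGS and (nxt is None or nxt.tag in BLOCK_TAGS):
            child.tail = None
    return parent


body = ET.fromstring("<body>\n<p>Abc.</p>\n<p>Def.</p>\n</body>")
strip_interblock_whitespace(body)
print(ET.tostring(body, encoding="unicode"))
```

    Applied to the <body> saved by the other editor, this yields the
    same "<p>Abc.</p><p>Def.</p>" form that XXE itself writes.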

Note, by the way, that in XXE it makes no difference whether one uses the "Semantic viewer" or the "Browser viewer". If, at least, the Browser viewer had been improved, then that would already be progress - it would be a way to improve the rendering.

Note, also, that of course the current behavior does not affect the semantics of the document. But it does hamper the semantic view ... It is irritating to have to see empty lines where there should be none.


Indeed. Sorry.




--
XMLmind XML Editor Support List
xmleditor-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/xmleditor-support
