If I ever learned how "mixed mode" affects white space handling in
XML (I probably did, 5-6 years ago), then I had at least forgotten
it.
Thanks for the advice to modify the DTD. I guess I will do that, if
you don't do it yourself. Because:
XMLmind has made a few willful violations of the HTML5 spec, such as
permitting the border attribute for the table element. And I think it
would be very meaningful, perhaps more meaningful than allowing the
border attribute for table, to limit the content model of <body>,
<section> and perhaps some other HTML block elements as well, to a model
where text nodes are not permitted. And why not do the same for the DITA
<section> element?
Remember, as well, that the current behavior causes authors to -
unwillingly - commit errors in the form of text nodes as direct children
of <body> and <section>.
Leif Halvard Silli
On 31 Dec 2020, at 10:25, Hussein Shafie wrote:
On 12/31/20 1:44 AM, Leif Halvard Silli wrote:
An interesting problem ...
To solve the issue from our point of view, I have invested in an XML
minifier. Why couldn't XXE do something similar?
Anyway, just some questions, for understanding and verification
of your explanation:
The <ol> and <ul> elements might, per HTML5, contain whitespace and
comments and even, I think, template elements and script elements.
This in addition to the obligatory <li> elements. Note that I am now
talking about the code level and not rendering level.
Such whitespace is not to be rendered, though. And, lo and behold,
XXE never renders empty lines inside <ol> or <ul>. So all is good.
How is this different from the <body> element?
The content model of <ol> and <ul> found in our in-house HTML5 W3C XML
Schema is:
---
((li | script | template))*
---
That is, TEXT not allowed. See attached screenshot.
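
To illustrate what an element-only content model like the one above
implies, here is a minimal Python sketch. It is not how XXE validates
anything; the helper name check_list_content is made up for this
example, element names are assumed to be non-namespaced, and
whitespace-only text is treated as ignorable.

```python
import xml.etree.ElementTree as ET

# Allowed children per the content model ((li | script | template))*.
ALLOWED = {"li", "script", "template"}

def check_list_content(elem):
    """Return a list of violations for an <ol>/<ul> element (hypothetical helper)."""
    problems = []
    # Text located before the first child element.
    if elem.text and elem.text.strip():
        problems.append("text: " + elem.text.strip())
    for child in elem:
        if child.tag not in ALLOWED:
            problems.append("element: " + child.tag)
        # Text following this child, still inside the list element.
        if child.tail and child.tail.strip():
            problems.append("text: " + child.tail.strip())
    return problems

good = ET.fromstring("<ul>\n  <li>One</li>\n  <li>Two</li>\n</ul>")
bad = ET.fromstring("<ul>stray<li>One</li><p>Oops</p></ul>")
print(check_list_content(good))
print(check_list_content(bad))
```

The whitespace between the <li> elements of the first list is accepted,
while the second list is reported for both its text node and its <p>
child.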
OTOH, the content model of <body> found in our in-house HTML5 W3C XML
Schema is:
---
Element body can contain TEXT.
((em | strong | small | s | cite |
q | dfn | abbr | ruby | data |
time | code | var | samp | kbd |
sub | sup | i | b | u |
mark | bdi | bdo | span | br |
wbr | mml:math | svg:svg | picture | img |
iframe | embed | area | label | input |
button | select | datalist | textarea | output |
progress | meter | link | script | template |
[1] | address | p | hr | pre |
blockquote | ol | ul | menu | dl |
figure | div | a#2 | ins#2 | del#2 |
object#2 | video#2 | audio#2 | map#2 | table |
form | fieldset | details | dialog | noscript#2 |
slot#2 | canvas#2 | article | section | nav |
aside | h1 | h2 | h3 | h4 |
h5 | h6 | hgroup | header | footer |
main))*
---
You can check this by yourself simply by selecting an element and then
choosing menu item "Help|Show Content Model"
(http://www.xmlmind.com/xmleditor/_distrib/doc/help/helpMenu.html)
If this really is different, why not switch to the XHTML1.x behavior?
XXE is an editor and not an XHTML renderer. A conscious break with
XHTML5, if necessary. Text nodes directly as children of <body> are
anyhow something to avoid. Especially in the kind of documents for
which XXE is an excellent writing tool.
In fact, any time XXE permits me to write something like the
following, it is - from my point of view - just an accident and a
confusing pain in the ass - code example:
<body>
<p>Para 1.</p>
Para 2.
<p>Para 3.</p>
</body>
Note that you can also achieve a similar "mess" with the stock DITA
DTD. For example, a <section> may contain TEXT in addition to
"blocks".
If it's OK for the schema, then it's allowed by XXE. It's as simple as
that.
In order to solve your issue, I would recommend using a customized
HTML5 schema (a very simple *strict* subset) rather than our stock
HTML5 schema.
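
Independently of which schema is used, stray text such as the "Para 2."
line in the example above can also be flagged outside the editor. The
following is only a sketch under assumptions: the function name
stray_text and the BLOCK_ONLY set are invented for this example, and
element names are assumed to be non-namespaced (real XHTML5 documents
carry the XHTML namespace, so the tag names would need adapting).

```python
import xml.etree.ElementTree as ET

# Assumption: parents in which direct text nodes are unwanted.
BLOCK_ONLY = {"body", "section"}

def stray_text(root):
    """Yield (parent tag, text) for non-whitespace text directly under BLOCK_ONLY elements."""
    for parent in root.iter():
        if parent.tag not in BLOCK_ONLY:
            continue
        # Text before the first child element.
        if parent.text and parent.text.strip():
            yield parent.tag, parent.text.strip()
        # Text between or after child elements.
        for child in parent:
            if child.tail and child.tail.strip():
                yield parent.tag, child.tail.strip()

doc = ET.fromstring(
    "<body>\n<p>Para 1.</p>\nPara 2.\n<p>Para 3.</p>\n</body>"
)
print(list(stray_text(doc)))
```

Run on the <body> example from this thread, the sketch reports the
single text node "Para 2." and ignores the line breaks between the
paragraphs.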
--
XMLmind XML Editor Support List
xmleditor-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/xmleditor-support