Title:

   RFE: Let automatic one-way conversion of named character entities
        *also* apply to documents without a DTD

Alternative title of this letter:

   RFE: Please take advantage of XXE’s destructive treatment of
        named character entities.

Issue:

XXE fails to work seamlessly with documents that lack a DTD defining named character entities, whenever such a document contains named character references.

Example:

   1. When trying to open the following, otherwise well-formed, document,
      <html xmlns="http://www.w3.org/1999/xhtml">&hellip;</html>
   2. XXE shows an error message stating that “hellip” is undeclared.
   3. If you click the «OK» button, the file opens in XML source view,
      but the HTML namespace is not “bestowed” upon the document, which
      means that XXE does not allow you to work with the document in
      semantic WYSIWYG mode.
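
For comparison, a small Python sketch (not XXE code) shows how a generic XML parser treats the same document. It uses only the standard library; `try_parse` is a helper name made up for this illustration. The point is merely that the undeclared reference, not the namespace, is what stops the parse:

```python
import xml.etree.ElementTree as ET

# The document from step 1: no DTD, one named character reference.
DOC = '<html xmlns="http://www.w3.org/1999/xhtml">&hellip;</html>'


def try_parse(source):
    """Return (root_tag, None) on success or (None, error_message) on failure."""
    try:
        return ET.fromstring(source).tag, None
    except ET.ParseError as err:
        return None, str(err)


# With the named reference, parsing fails because "hellip" is undeclared.
tag, error = try_parse(DOC)
print(tag, error)

# With the directly typed character, the same document parses and the
# root element is in the XHTML namespace.
fixed_tag, fixed_error = try_parse(DOC.replace("&hellip;", "\u2026"))
print(fixed_tag, fixed_error)
```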

Consequences of current behavior:

   Your workflow is broken.

Workarounds:

   1. Alternative: You go through steps 1 to 3 in the example, apply
      some conversion tool to the source code, and reopen the file.

   2. Alternative: You fiddle with the other editor you use, if you
      have access to it and if that editor allows you to prefer
      directly typed text over named character entities.

   3. Alternative: You attach a DTD to the files you work with. The
      "HTML MathML entity set" would do the job:
      https://www.w3.org/TR/xml-entity-names/#htmlmathml

   4. XXE itself should have a (better) strategy for this!

      In particular, for HTML documents, I do not get why XXE
      treats undeclared named character entities (so) differently
      from declared named character references!
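
As a rough illustration of workaround 1, the conversion step could look something like the Python sketch below. It is not an XXE feature. `expand_named_entities` is a hypothetical helper name, and the HTML5 entity table from Python's standard `html.entities` module stands in for whatever entity set your documents actually use. Being regex-based, it would also rewrite references inside comments and CDATA sections, so treat it as a starting point only:

```python
import re
from html.entities import html5
from xml.sax.saxutils import escape

# The five entities that XML predefines need no DTD, so they are left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_named_entities(source: str) -> str:
    """Replace known HTML named character references with literal characters."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        chars = html5.get(name + ";")
        if chars is None:
            return match.group(0)  # unknown name: leave the reference untouched
        # Re-escape markup characters, since a few names (e.g. AMP, QUOT)
        # expand to characters that are significant in XML.
        return escape(chars, {'"': "&quot;"})

    return ENTITY_REF.sub(replace, source)


print(expand_named_entities(
    '<html xmlns="http://www.w3.org/1999/xhtml">&hellip;</html>'
))
```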

What should happen instead?

Proposal:

XXE should open such documents “normally”, that is, in the specified namespace, perhaps with a warning stating that named character entities have been converted to directly typed characters.

Justifications:

Even for documents *with* a DTD that defines named character entities, XXE simply, and without a warning, converts those entities to directly typed characters.[*] This “destructive” behavior with regard to named character entities seems to contradict what the XXE documentation states about the XML source view, see <https://www.xmlmind.com/xmleditor/_distrib/doc/help/xml_source_menu_item.html>, quote:

]] The source view of a document shows the contents of the save file which would be created by XXE for this document (same automatic indentation, same named character entities, etc). [[

With regard to named character references, this is not the case: such entities are silently converted to their directly typed character equivalents.

(Caveat: You, as a user, can make XXE preserve named character entities, but then you must add “hellip” (and any other named character entities that you want to keep) to the list of exceptions in XXE’s Save preferences. Only then will &hellip; be saved and displayed in XML Source view.)

However, XXE’s “destructive” behavior is fully in line with XXE’s “destructive” treatment of source code. For example, XXE does not care about your nicely indented code, but instead inserts and removes non-semantic whitespace wherever it wants.

Finally, there is already an option in the Preferences to «Simulate a DTD» when there is no DTD, and it would certainly be appropriate, make sense, and be in line with the HTML5 spec, to (at least with a warning) simulate that named character entities have been declared.
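
To make the idea concrete: “simulating” the declarations could amount to something like the following Python sketch, which prepends an internal DTD subset declaring just the named entities that the document uses. This only illustrates the principle and says nothing about how XXE’s «Simulate a DTD» option is implemented. `with_simulated_entities` is a hypothetical name, the HTML5 table from `html.entities` is an assumed entity set, and the input is assumed to have no XML declaration or DOCTYPE of its own:

```python
import re
import xml.etree.ElementTree as ET
from html.entities import html5

XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def _char_ref(ch: str) -> str:
    # "<" and "&" must be double-escaped so that the entity's replacement
    # text is still character data when the reference is expanded.
    if ch in "<&":
        return f"&#38;#{ord(ch)};"
    return f"&#x{ord(ch):X};"


def with_simulated_entities(source: str, root_name: str = "html") -> str:
    """Prepend a DOCTYPE whose internal subset declares the entities used."""
    names = sorted(set(re.findall(r"&([A-Za-z][A-Za-z0-9]*);", source)))
    decls = []
    for name in names:
        chars = html5.get(name + ";")
        if name in XML_PREDEFINED or chars is None:
            continue  # predefined by XML itself, or not a known HTML name
        value = "".join(_char_ref(ch) for ch in chars)
        decls.append(f'<!ENTITY {name} "{value}">')
    return f"<!DOCTYPE {root_name} [{''.join(decls)}]>{source}"


src = '<html xmlns="http://www.w3.org/1999/xhtml">&hellip;&nbsp;</html>'
root = ET.fromstring(with_simulated_entities(src))
print(root.tag, repr(root.text))
```

With the declarations in place, the parser expands the references itself and the root element ends up in the XHTML namespace, which is the outcome the proposal asks for.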

The final justification is, of course, interoperability. There are just too many XML/XHTML/HTML editors and authors out there that insert various characters as named character entities. One of the most common cases is the no-break space character, which so often occurs in source code as "&nbsp;".

From time to time, there are complaints about XXE’s “destructive” treatment of source code. My claim is that, with regard to named character references, XXE would fit better into common workflows if it went all out in its “destructive” behavior.

Leif Halvard Silli

--
XMLmind XML Editor Support List
xmleditor-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/xmleditor-support
