L. H. Silli wrote:
> 
> I was testing your product again. And in all the tests I made, I found 
> that it deletes the Byte-Order Mark (BOM). 

That's right.



> It also does not use the BOM 
> to interpret the encoding. 

XXE uses the Xerces XML parser and, to my knowledge, Xerces uses the BOM
to interpret the encoding. If this is not the case, then please report
this issue as a bug.
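
For reference, BOM-based detection amounts to something like the
following. This is only a minimal Python sketch, not the actual code of
Xerces (which also has to handle documents without a BOM); the function
name sniff_bom is made up for this illustration.

```python
import codecs

# UTF-32 must be tested before UTF-16: the UTF-32LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found.

    Hypothetical helper for illustration only.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

A caller would skip the first bom_length bytes and decode the rest using
the returned encoding name.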



> If the encoding is US-ASCII or an 8-bit 
> legacy encoding, then it is a fatal error (per the XML 1.0 spec) to 
> have the BOM before the XML declaration if the document is not UTF-8 
> encoded.

This never, ever, happens with files created by our product. If you have
found that this actually happens, then please report this issue as a bug.



> 
> I bring this up because, again, I'm interested in producing 
> HTML-compatible XHTML documents, and the BOM is the only encoding 
> indicator that works in both XML and HTML.

Sorry but we don't agree.

Next week we are going to release v4.9.1, which lets you create
HTML-compatible XHTML documents, and v4.9.1 does not use a BOM for that.

Excerpts from the release notes of v4.9.1:
---
XHTML files are now saved differently than in previous releases:

    * If you want to omit the XML declaration (that is,
      <?xml version="1.x"...?>) from the save file, then add
      <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"/>
      to the head element.

      For the XML declaration to be omitted, the media type must be
      "text/html" and the charset must be "UTF-8".

      This is useful because both the XML declaration and the <!DOCTYPE>
      declaration have an effect on the behavior of Web browsers. See
      Activating Browser Modes with Doctype
      (http://hsivonen.iki.fi/doctype/).

    * If you just want to force the encoding of a specific XHTML
      document to be, for example, Windows-1250 without having to tweak
      Options|Preferences, Save options, then add, for example,
      <meta http-equiv="Content-Type"
      content="application/xhtml+xml;charset=Windows-1250"/>
      to the head element.
---
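
To make the rule above concrete, here is a rough Python sketch of the
decision it describes. It is not XXE's actual code: the function name
xhtml_save_options, the case-insensitive comparisons and the UTF-8
fallback when no charset is given are all assumptions made for this
illustration.

```python
def xhtml_save_options(meta_content, default_encoding="UTF-8"):
    """Return (encoding, omit_xml_declaration) for a Content-Type value.

    Hypothetical helper mirroring the release notes quoted above.
    """
    media_type, _, params = meta_content.partition(";")
    media_type = media_type.strip().lower()

    # Look for a charset=... parameter after the media type.
    charset = None
    for param in params.split(";"):
        name, _, value = param.partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "charset" and value:
            charset = value

    # The XML declaration is omitted only when BOTH conditions hold:
    # media type "text/html" and an explicit charset of "UTF-8".
    omit = (media_type == "text/html"
            and charset is not None
            and charset.upper() == "UTF-8")
    return (charset or default_encoding), omit
```

With the two meta elements quoted in the release notes, this sketch
omits the XML declaration for the first one and only forces the encoding
to Windows-1250 for the second one.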



> 
> So, if you bring the XML editor in line with XML 1.0,
> you will also make it more suitable for producing HTML.

To our knowledge, our XML editor is in line with XML 1.0 and XML 1.1. We
have simply chosen not to add a UTF-8 BOM at the beginning of the
UTF-8 encoded XML files that we create.

In the case of a document encoded using UTF-8, the BOM is never really
needed. When there is neither an encoding declaration
(<?xml ... encoding="XXX"?>) nor a BOM, the parser automatically
defaults to UTF-8.

Excerpts from Extensible Markup Language (XML) 1.0 (Fifth Edition)
http://www.w3.org/TR/xml/
---
In the absence of information provided by an external transport protocol
(e.g. HTTP or MIME), it is a fatal error for an entity including an
encoding declaration to be presented to the XML processor in an encoding
other than that named in the declaration, or for an entity which begins
with neither a Byte Order Mark nor an encoding declaration to use an
encoding other than UTF-8. Note that since ASCII is a subset of UTF-8,
ordinary ASCII entities do not strictly need an encoding declaration.
---
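
This default is easy to check with any conforming parser. The small
Python sketch below uses the standard xml.etree.ElementTree module (not
Xerces, so it only illustrates the rule of the spec, not the behavior of
XXE); the helper name text_of is made up for this example.

```python
import xml.etree.ElementTree as ET


def text_of(data: bytes):
    """Parse raw bytes; return the root element's text, or None if the
    parser reports a fatal error."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


# The same document, without any XML declaration, in three byte forms.
utf8_no_bom = "<p>caf\u00e9</p>".encode("utf-8")
utf8_with_bom = b"\xef\xbb\xbf" + utf8_no_bom
latin1_no_bom = "<p>caf\u00e9</p>".encode("latin-1")

# No BOM, no encoding declaration: the bytes are read as UTF-8.
print(text_of(utf8_no_bom))
# A UTF-8 BOM is accepted too, and gives the same text.
print(text_of(utf8_with_bom))
# ISO-8859-1 bytes with neither a BOM nor a declaration are rejected.
print(text_of(latin1_no_bom))
```

The first two calls return the same text, which is why we consider the
UTF-8 BOM redundant in the files we create.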

Now, we may be wrong and, in that case, we would be grateful if you could
point us to the specification which mandates adding such a UTF-8 BOM.




 
--
XMLmind XML Editor Support List
xmleditor-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/xmleditor-support
