Great! Made my day hearing that!
Leif
On 2023-05-04 12:33, Hussein Shafie wrote:
We'll try to implement in the next version of XXE what you call "the
third alternative" in the email below and explain in greater detail
in your following email.
We'll do this even if the XInclude standard completely ignores the
lang attribute. It's either doing this or tolerating having an
inconsistent XInclude implementation when used in the context of
(X)HTML documents.
On 5/4/23 00:39, Leif H Silli wrote:
XMLmind XML Editor (XXE) is an - eh - XML editor. But as XHTML is
typically consumed as text/HTML, in practice it is also an HTML editor.
XHTML has always had a 'complicated' relationship to xml:lang. It
seems like everyone really wants to just use lang - or at least use
just one attribute (which would have had to be lang). But, either in
order to "look good" or in order to use XML tooling (which is
rumoured not to support lang), it is customary to take on the burden
of adding both lang and xml:lang. This was recommended back
in 1998 when XHTML 1 was released. And it was also recommended by the
polyglot HTML draft (even if, in my heart, I did not want to require
both attributes).
Context: As for tooling, we see this in XXE itself. When working
with XInclude, it turns out that @lang is ignored - only xml:lang is
respected and counted. (I might write a separate RFE or BUG about
that.) The consequence is that if, via XInclude, you embed
<section lang="en" id="sect01" />
then the way XXE implements what XInclude calls 'language fixup'
causes the above element to be embedded as if the language property is
unknown, which (as of XXE 10.4) means that XXE adds xml:lang="" to the
embedded element:
<section xml:lang="" id="sect01" />
And, if it does not delete the lang attribute, we even get this:
<section xml:lang="" lang="en" id="sect01" />
This means that XInclude has created an invalid document, since, when
both xml:lang and lang are used, they must be in agreement. Clearly,
this behavior is wrong - on many levels. (A separate bug about this
will probably be written.)
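To make the "must be in agreement" rule concrete, here is a minimal Python sketch of a check that flags elements where lang and xml:lang are both present but differ. It only illustrates the rule; it is not how XXE validates documents, and the function name lang_conflicts is made up for this example. The comparison is case-insensitive, which I assume is what HTML intends.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def lang_conflicts(root):
    """Return elements that carry both lang and xml:lang with differing values.

    Hypothetical helper for illustration; values are compared
    case-insensitively.
    """
    conflicts = []
    for el in root.iter():
        lang = el.get("lang")
        xml_lang = el.get(XML_LANG)
        if lang is not None and xml_lang is not None:
            if lang.lower() != xml_lang.lower():
                conflicts.append(el)
    return conflicts

# The element from the example above, wrapped in a parent.
doc = ET.fromstring(
    '<body><section xml:lang="" lang="en" id="sect01" /></body>'
)
print([el.get("id") for el in lang_conflicts(doc)])  # prints ['sect01']
```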
However, this message is not about how XInclude is implemented, but
about making it more convenient to apply xml:lang in XHTML and HTML
documents. And the (current) need to use xml:lang warrants that it
should be simpler.
I see two ways to make it simpler: EITHER add some form of automation:
When someone adds or edits the lang attribute, then xml:lang is added
and/or edited automatically, in parallel. OR offer xml:lang in the
default list of attributes to select from. (Clearly I prefer the
automated variant.) AND a third option: Decrease the need to use
xml:lang.
So as of today, when authoring HTML or XHTML documents, the lang
attribute is by default visible inside the Attribute editor. Just
click on the attribute name, and add the value. Whereas for xml:lang,
you must either manually type the name of the attribute before you can
select it, or you can change the defaults (on the fly) so that
so-called xml attributes are also visible. (But this also makes
xml:base and xml:space visible.)
But the third alternative is what I prefer the most: Decrease the need
to use xml:lang. For instance, by changing the implementation of
XInclude so that lang is treated like xml:lang (and/or so that
xml:lang is kept in sync with lang).
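As a rough illustration of "keeping xml:lang in sync with lang", here is a small Python sketch that copies every lang value onto xml:lang. It is only a sketch of the idea, not a description of any existing XXE feature; the function name sync_xml_lang is invented for this example, and it assumes that lang is the authoritative attribute.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def sync_xml_lang(root):
    """Copy each element's lang value onto its xml:lang attribute.

    Hypothetical helper: lang is assumed to be authoritative, so an
    existing xml:lang with a different value is overwritten.
    """
    for el in root.iter():
        lang = el.get("lang")
        if lang is not None:
            el.set(XML_LANG, lang)
    return root

section = ET.fromstring('<section lang="en" id="sect01" />')
sync_xml_lang(section)
print(section.get(XML_LANG))  # prints en
```

Elements without a lang attribute are left untouched, so an xml:lang that stands alone is not removed by this sketch.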
On 5/4/23 01:36, Leif H Silli wrote:
The release notes for XXE 10.4 refer to language fixup in XInclude
1.1. I must first start by explaining why we should not give too much
heed to what XInclude 1.0 or 1.1 says about language properties.
XInclude needs an update. XInclude 1.1 is a Working Group Note from
2016 [1], while the final version of a Recommended spec, XInclude 1.0,
is from 2006 [2].
The Note from 2016 includes some innovations such as set-xml-id
(though it should probably have had a set-id attribute as well).
But at the same time, when it comes to language properties, the spec
that it references, IETF RFC 3066, published in 2001, was already
outdated when the Note was published: the current best practice for
language tagging, BCP 47, was specified in 2009 - seven years before
the Note was finished [3].
The work on HTML5 began around 2006, when the first XInclude was
published. In HTML4, the lang attribute behaved differently from the
xml:lang attribute. But in HTML5, which implements BCP 47, the
specification of lang has been 'updated' so that lang and xml:lang
work the same (the only difference being that xml:lang only works when
consumed as XML).
So the Working Group Note from 2016 does not pick up all the changes
that happened to HTML and language tagging since 2006. Perhaps that is
the reason why XInclude only talks about xml:lang and not about lang?
XInclude seems to have been created in the spirit of XHTML 1.0, when
the attitude was that we would soon kill text/html. And so, for
example, XInclude 1.1’s section on Language Fixup from 2016 is
identical to XInclude 1.0’s section on Language Fixup from 2006.
Instead, we have ended up with a situation where we try to keep HTML
as XML and HTML as text/html in sync as much as possible. In sync, but
different.
It therefore no longer makes sense that XInclude only considers
xml:lang and ignores lang.
XMLmind XML Editor version 10.4 is an example of this. Per the release
notes [4], XXE 10.4 “Made the language fixup of the XInclude 1.1
implementation more conforming to the specification.” (As I mentioned
above, the language fixup of XInclude 1.1 [5] is identical to the
language fixup of XInclude 1.0 [6], so - sorry to say it, but - the
reference to XInclude 1.1 here simply gives the appearance of being an
up-to-date reference.)
So what is the change in 'language fixup' that has been added in XXE
10.4? Here is an example:
When working with XInclude, it turns out that XXE ignores @lang - only
xml:lang is respected and counted. The consequence being that if, via
XInclude, you embed into another element the following element,
<section lang="en" id="sect01" />
then the way XXE implements 'language fixup' causes the above element
to be embedded as if the language property is unknown, which (as of
XXE 10.4) means that XXE adds xml:lang="" to the embedded element:
<section xml:lang="" id="sect01" />
And, if it does not delete the lang attribute, we even get this:
<section xml:lang="" lang="en" id="sect01" />
(Sometimes the @lang is deleted, other times it is not, I am not yet
certain about when what happens - but both things are meaningless.)
This means that XXE 10.4’s implementation of XInclude has created an
invalid document, since, when both xml:lang and lang are used, they
must be in agreement. It has also failed to take the lang="en"
attribute into account, thus losing information. Furthermore, if the
end result - the resulting document of the xinclusion - is meant for
consumption by text/HTML consumers, those consumers do not understand
the xml:lang attribute at all.
Solution: The solution is to treat @lang and xml:lang equally. Thus in
the example above, the result would have become this:
<section lang="en" id="sect01" />
Or (if you want to consider that not all XInclude processors - if
any at all - handle the lang attribute) this:
<section lang="en" xml:lang="en" id="sect01" />
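To show what treating @lang and xml:lang equally could look like, here is a simplified Python sketch of a language fixup step. It is not XXE's implementation and not a complete XInclude processor. The function names and the two "inherited language" parameters are assumptions made for this example, since ElementTree has no parent pointers and the caller has to supply the surrounding languages.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def own_lang(el):
    """Language declared on the element itself: xml:lang first, then lang."""
    value = el.get(XML_LANG)
    if value is None:
        value = el.get("lang")
    return value

def language_fixup(included, source_inherited=None, target_inherited=None):
    """Simplified fixup that honours both lang and xml:lang.

    source_inherited: language the element inherited in its source document.
    target_inherited: language in effect at the inclusion point.
    Both parameters are hypothetical; None means "unknown".
    """
    declared = own_lang(included)
    if declared is not None:
        # A language is already declared: make the two attributes agree.
        included.set("lang", declared)
        included.set(XML_LANG, declared)
    elif (source_inherited or "").lower() != (target_inherited or "").lower():
        # Nothing declared and the context changes: record the source
        # language explicitly ("" meaning unknown) on both attributes.
        value = source_inherited or ""
        included.set("lang", value)
        included.set(XML_LANG, value)
    return included

section = ET.fromstring('<section lang="en" id="sect01" />')
language_fixup(section, target_inherited="no")
print(section.get("lang"), section.get(XML_LANG))  # prints en en
```

With this approach the lang="en" from the example is preserved and mirrored to xml:lang, and xml:lang="" is only produced when the element truly has no known language.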
[1] https://www.w3.org/TR/xinclude-11/
[2] https://www.w3.org/TR/xinclude/
[3] https://www.ietf.org/rfc/bcp/bcp47.txt
[4] https://xmlmind.com/xmleditor/changes.html#v10.4.0
[5] https://www.w3.org/TR/xinclude-11/#language
[6] https://www.w3.org/TR/xinclude/#language
--
XMLmind XML Editor Support List
xmleditor-support@xmlmind.com
http://www.xmlmind.com/mailman/listinfo/xmleditor-support