On 28 May 2017, at 9:32, Hussein Shafie wrote:
On 05/28/2017 01:43 AM, Leif Halvard Silli wrote:
By specifying <assembly xml:lang="fr">, I expected the Table of
Contents heading of the realized document to be generated in French
(when converting to HTML).[1]
But that did not happen: it was generated in English. (For the record:
doing <assembly xml:lang="fr"> *does* affect the language specified by
xml:lang in the output HTML document.)
I cannot reproduce this:
---
<assembly xml:lang="fr"> *does* affect the language specified by
xml:lang in the output HTML document
---
I mean, in my tests, <assembly xml:lang="fr"> has no effect whatsoever
on the generated document.
Ahem. You are right. My bad. The language was, it seems, picked up from
the topic files. When I removed the language from the topic files, I had
to specify the language on <structure> in order to get it set in the
output file.
(When I talk about the effect of placing xml:lang on <structure>, I
refer to [for XHTML5 output] the language being set on the topmost
<section> element, because the XSL stylesheets default to placing
xml:lang there. Good practice is to place it on the root element. But
that is a separate issue, which I raised, too, on the DocBook XSL
mailing list some weeks ago.)
Only when I specified <structure xml:lang="fr"> did the Table of
Contents heading display in French.
Yes, that's right.
XML elements inherit the language. Thus this might be an assembly
processor bug.
It's not that clear.
Disagree, with regard to the specific issue at hand. See below.
The <assembly> is a document in itself: a specification (like a
makefile or an Ant build.xml file) having its own title, author and
language.
The title, author and language of a specification may be completely
unrelated to the title, author and language of the documents generated
using this specification.
In my view, this is a somewhat theoretical claim. But even if it is
theoretical, it is already covered by the XML 1.0 specification. Though,
of course, there can be some legitimate minor issues to clarify in the
assembly specification.
When I say that it is theoretical, it is based on the fact that, in most
cases, it makes no sense for the realized document to be in another
language than the <assembly> element - see explanations below. And I am
not at all certain that it is helpful to compare with a makefile or an
Ant build.xml. It seems far more relevant to consider whether it makes
sense to specify one language for <html> and another for <body> - see
below.
That's why I'm not sure that you have reported a bug.
I considered what you say.
But first: unless the child element <structure> has its own
xml:lang="foo" specification, it does (so says the XML spec) inherit the
language of the parent <assembly> element. Thus setting the language of
<assembly> is equal to also setting it on the child elements, such as
<structure> - [quoting XML
1.0](https://www.w3.org/TR/xml/#sec-lang-tag):
* «The language specified by xml:lang applies to the element where
it is specified (including the values of its attributes), and to all
elements in its content unless overridden with another instance of
xml:lang.»
Thus, to the extent that it is the language of <structure> that governs
the language of the (topmost element of the) realized document, then, in
the case of <assembly xml:lang="fr">, it is, per the XML spec,
unnecessary to specify xml:lang="fr" on <structure>.
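
A minimal sketch of the inheritance rule quoted above, assuming a
hypothetical assembly with a single resource (the ids and the file name
are invented for illustration). Per XML 1.0, <structure> and <module>
below are in French because they sit inside <assembly xml:lang="fr">;
whether a given assembly processor honours this is the question at hand.

```xml
<!-- Hypothetical assembly; ids and href are illustrative only. -->
<assembly xmlns="http://docbook.org/ns/docbook" version="5.1"
          xml:lang="fr">
  <resources>
    <resource xml:id="intro" href="topics/intro.xml"/>
  </resources>
  <!-- No xml:lang here: per XML 1.0 the effective language is "fr",
       inherited from <assembly>. -->
  <structure xml:id="guide" renderas="book">
    <module resourceref="intro"/>
  </structure>
</assembly>
```

Under this rule, repeating xml:lang="fr" on <structure> should be
redundant rather than required.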
And thus - and again: to the extent that it is the language of
<structure> that governs the language of the realized document (and
currently that is the way you implement it), it is a bug that the
assembly processor does not recognize that the language of <structure>
has already been set by <assembly xml:lang="fr">. I don’t spot any
loophole for any other interpretation.
Switching from what applications MUST do over to what authors MAY do
(which seems relevant to discuss, in view of what you said above):
The basic rule for language tagging is that you tag the root element
with the main or dominant content language of the document (there are
also tags for specifying the language as 'multilingual', 'unknown',
etc.). Any child element that deviates from what is specified in the
root element is then tagged as an exception. In fact, this rule 'jumps
out' (at least to me) as the simple and logical good practice, given
how xml:lang (and @lang in HTML5) is defined.
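
As an illustration of that rule, here is a small sketch in XHTML syntax
(the content is invented for the example): the root element carries the
main language, and only the deviating phrase is tagged.

```xml
<!-- Root carries the main language; only the exception is tagged. -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head><title>Language tagging</title></head>
  <body>
    <p>The French heading reads
       <span xml:lang="fr" lang="fr">Table des matières</span>.</p>
  </body>
</html>
```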
That said: the spec permits us to switch language at whim. Thus, if we
look at HTML: it is perfectly legal to do <html lang="en"> and then to
do <body lang="fr">. However: while there certainly are exceptions, it
almost never makes any sense to specify one language in the root and
then another language in the main child element.
And I see no difference with regard to <assembly> versus <structure>. In
fact, it could be said to make less sense in a DocBook assembly document
than in an HTML document. Why? Because, in HTML, specifying the language
on <body> covers most of the public-facing content anyway. And while it
is true that <structure> takes care of most of the public-facing content
as well, the realized document picks up content not only from
<structure> but also from the <relationships> element. And so, if you
only specify the language on <structure> (and also: if the assembly
processor fails to notice that the language was already declared on
<assembly>), you must specify the language of the <relationships>
elements as well.
Example from the assembly document for «DocBook assemblies and topics
for the impatient» - if you only specify the language on <structure>,
you MUST also specify the language on <relationships> (if you care
about spellchecking inside the assembly document, for instance):

    <relationships xml:lang="nn">
      <relationship>
        <association>Sjå også</association>
        <instance linkend="omittitles"/>
        <instance linkend="contentonly"/>
      </relationship>
      <relationship>
        <association>Sjå også</association>
        <instance linkend="filtering"/>
        <instance linkend="output"/>
      </relationship>
    </relationships>

The simple solution (and I hope that not my claim, but the very facts I
have - hopefully - pointed to, are convincing) is thus that one should
declare the language on the <assembly> element. That is the sensible
thing to do for the common case: monolingual realized documents without
any sudden language switches. And there really isn’t anything special
about DocBook assemblies that should make us think otherwise in this
detail.
Also please note that the documentation of DocBook v5.1 assemblies is
currently pretty sketchy:
http://tdg.docbook.org/tdg/5.1/ch06.html
We expect to fix issues like the one you have reported once DocBook
v5.1 assemblies become better documented. (XMLmind has already filed
several bug reports signaling missing information in the documentation
of DocBook v5.1 assemblies.)
I can see that it might be helpful for authors and developers if the
DocBook assembly spec said something about how language inheritance is
meant to work, and about the relationship between the language of
<assembly> (and its elements), the language of the realized document,
and the language of the output from XSL conversion. There are some
nuances worth pointing out there, I suppose.
(For instance, what if you specify contentonly='true' on a module
element? This strips the root (or container) element from the 'pulled'
topic. What, then, if that container also had xml:lang="foo" set? Does
stripping the wrapper element also strip the language information from
the pulled content? [My guess is that this would indeed strip the
language completely from the pulled content, unless the language was
also specified redundantly by placing xml:lang="foo" on the pulled
child elements of the container element, but I am not completely
certain.])
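
The scenario can be sketched with two separate fragments, assuming a
hypothetical topic file and id (both invented for illustration). The
sketch only shows the inputs; what a given processor does with the
language here is the open question.

```xml
<!-- topics/install.xml (hypothetical topic): language set on the
     root element only. -->
<topic xmlns="http://docbook.org/ns/docbook" version="5.1"
       xml:id="install" xml:lang="nn">
  <title>Installasjon</title>
  <para>Dette avsnittet er på nynorsk.</para>
</topic>

<!-- In the assembly (fragment): contentonly="true" pulls the topic
     without its <topic> wrapper. Nothing on <para> itself records
     that its language is "nn". -->
<module resourceref="install" contentonly="true"/>
```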
But most - or at least a fair amount - of what it can be expected to say
is probably easily deducible from the XML spec and from the best
practices that have long since been defined w.r.t. language tagging.
[1] FYI, I worked with the source code of “DocBook Assemblies and
Topics for the Impatient”
<http://www.xmlmind.com/tutorials/DocBookAssemblies/index.html>.
--
leif halvard silli