On Wed, Dec 24, 2025 at 12:23:08AM +0000, Gavin Smith wrote:
> Yes, if the input encoding was ISO-8859-1, the output encoding for DocBook
> would be ISO-8859-1.  OUTPUT_ENCODING_NAME would be set to "utf-8" from
> %defaults, but this would then be immediately overridden by the input
> encoding when set_document is called.  That's how it works currently: it's
> possible it has worked differently in the past.

I do not think so: I tested with Texinfo 5 and the behaviour is the same.
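
As a rough illustration of the order described in the quoted paragraph,
here is a minimal Python sketch.  It is not the Texinfo code (which is
Perl): the function name resolve_output_encoding and its parameters are
made up for the example, and only OUTPUT_ENCODING_NAME, %defaults and
set_document come from the discussion.

```python
from typing import Optional

# Stands in for the converter's %defaults (assumption: only this key matters here).
DEFAULTS = {"OUTPUT_ENCODING_NAME": "utf-8"}


def resolve_output_encoding(input_encoding: Optional[str],
                            command_line: Optional[str] = None) -> str:
    """Hypothetical model of how the output encoding ends up being chosen."""
    # 1. Start from the converter default.
    encoding = DEFAULTS["OUTPUT_ENCODING_NAME"]
    # 2. When the document is set (set_document in the discussion), the
    #    input encoding overrides the default if the manual declares one.
    if input_encoding is not None:
        encoding = input_encoding
    # 3. An explicit OUTPUT_ENCODING_NAME from the command line wins over both.
    if command_line is not None:
        encoding = command_line
    return encoding
```

With this model, resolve_output_encoding("ISO-8859-1") gives the input
encoding back, and passing "utf-8" as the second argument forces UTF-8.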

> We could prefer UTF-8 for DocBook and HTML output too.  There doesn't
> seem to be a strong reason to prefer the input encoding.  (Users could
> still override the output encoding by setting OUTPUT_ENCODING_NAME on the
> command line.)  There's just a small chance that users are processing input
> files for some niche case where they want to preserve the input encoding.
> But it doesn't seem as necessary as the output is not actually broken for
> these output formats.  I think we should change it only if there are problems
> (e.g. if DocBook processors refuse to process non-UTF-8 input).

To me there are two use cases.
* Users could prefer a given encoding for the manual.  In that case it
  makes sense to use this encoding for the output formats too.
* There are legacy manuals in encodings that were used before but have
  been superseded by UTF-8.  In that case it would be better to have
  UTF-8 as the output encoding, independently of the input encoding.
  The user may not even have the tools to process files in this other
  encoding.

> > It is not clear to me what the best interface could be.  We could
> > imagine using something similar to the file names encoding, ie have a
> > variable like
> >   DOC_ENCODING_FOR_OUTPUT_ENCODING_NAME
> > and if it is set to 0, the default OUTPUT_ENCODING_NAME would be left as
> > is.
> > 
> > But your approach is ok too.
> 
> I don't see the need for a new variable for this.  Users can already set
> OUTPUT_ENCODING_NAME if they need to.

It is not exactly the same.  A variable would make it possible to select
one of the use cases above in a more generic way.  However, UTF-8 is the
output encoding that would be used for all the formats that can specify
one, and it is also the default encoding.  Given also that there is not
much point in using anything other than UTF-8, and that it is pretty
easy to change the encoding of a manual, I agree that there is no need
to add another variable.
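
For reference, a minimal Python sketch of what such a variable could have
done.  This is only an illustration under assumptions: the function name
default_output_encoding and its parameters are hypothetical, and the flag
merely mirrors the DOC_ENCODING_FOR_OUTPUT_ENCODING_NAME idea quoted
above.  Nothing like it exists in Texinfo.

```python
from typing import Optional


def default_output_encoding(input_encoding: Optional[str],
                            doc_encoding_for_output: bool = True,
                            default: str = "utf-8") -> str:
    """Pick the default output encoding under a hypothetical on/off flag."""
    # Flag set: first use case, the manual's own encoding is reused for output.
    if doc_encoding_for_output and input_encoding is not None:
        return input_encoding
    # Flag unset (or no declared input encoding): second use case, the
    # default output encoding is left as is.
    return default
```

Setting the flag to false for a legacy ISO-8859-1 manual would leave the
output in UTF-8, which is what setting OUTPUT_ENCODING_NAME explicitly
already achieves.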

-- 
Pat
