On Sun, Dec 21, 2025 at 10:08:58PM +0000, Gavin Smith wrote:
> Here's a patch:
Here's a more complete patch. To avoid changing the output for
HTML, DocBook and one other output format ("Texinfo XML"), when the
input was not UTF-8, I had to remove the default OUTPUT_ENCODING_NAME
UTF-8 setting. Otherwise these formats would be forced to UTF-8 as
well.
I'm slightly worried this may have some unintended effect (possibly on
a setup different from mine). For example, OUTPUT_ENCODING_NAME might
end up unset. I expect the output encoding to always be set, as it
should be propagated from the input encoding, which should always
be set.
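As a minimal sketch of the resolution order described above (not the
actual code in document.c; the function name resolve_output_encoding
and its parameters are hypothetical, for illustration only): a
converter default wins if present, otherwise the input encoding is
propagated, and the value stays unset only if both are missing.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Texinfo: models how the output
   encoding is expected to be chosen after this patch.
   CONVERTER_DEFAULT is the converter's OUTPUT_ENCODING_NAME default
   (e.g. "utf-8" for LaTeX, NULL for HTML/DocBook/Texinfo XML).
   INPUT_ENCODING is the document's input encoding name, or NULL.  */
static const char *
resolve_output_encoding (const char *converter_default,
                         const char *input_encoding)
{
  /* An already-set output encoding is no longer overwritten.  */
  if (converter_default)
    return converter_default;

  /* Otherwise propagate the input encoding, as before.  */
  if (input_encoding)
    return input_encoding;

  /* The worrying case: nothing set anywhere.  */
  return NULL;
}
```

Under these assumptions, LaTeX with a Latin-1 input would resolve to
"utf-8", HTML with the same input would resolve to the input encoding,
and only a document with no input encoding at all would leave the
value unset.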
I don't think that the OUTPUT_ENCODING_NAME defaults did very much,
but I'm not certain. It's possible these default values stemmed from
a time before UTF-8 was the default input encoding for Texinfo. (For
example, "git blame" tracks the setting in DocBook.pm to a commit on
2012-09-14 (49aa00da6ae37), whereas UTF-8 only became the default input
encoding in 2019.)
diff --git a/ChangeLog b/ChangeLog
index b4185d5f7a..9f549d6155 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,24 @@
+2025-12-23 Gavin Smith <[email protected]>
+
+ UTF-8 by default for LaTeX output
+
+ * tta/perl/Texinfo/Convert/LaTeX.pm (%defaults):
+ Set OUTPUT_ENCODING_NAME to 'utf-8'.
+
+ * tta/perl/Texinfo/Common.pm (set_output_encoding),
+ * tta/C/main/document.c (set_output_encoding): Only propagate
+ encoding name from input encoding to output encoding if output
+ encoding is not already set.
+ * tta/perl/Texinfo/Convert/Text.pm: update comments
+
+ * tta/data/converters_defaults.txt (html_converter),
+ * tta/perl/Texinfo/Convert/DocBook.pm (%defaults),
+ * tta/perl/Texinfo/Convert/HTML.pm (%defaults),
+ * tta/perl/Texinfo/Convert/TexinfoXML.pm (%defaults):
+ Remove OUTPUT_ENCODING_NAME utf-8 default.
+
+ * NEWS: update
+
2025-12-23 Patrice Dumas <[email protected]>
* tta/C/convert/convert_html.c (html_conversion_finalization),
diff --git a/NEWS b/NEWS
index a709f21184..cddc981fd6 100644
--- a/NEWS
+++ b/NEWS
@@ -70,6 +70,9 @@ See the manual for detailed information.
. Info output:
. new (experimental) variable INFO_MATH_IMAGES allows outputting
images for mathematics notation
+ . LaTeX output:
+ . use UTF-8 encoding for output by default, regardless of input
+ encoding. override with OUTPUT_ENCODING_NAME.
. Remove the Texinfo::TeX4HT customization package.
. XML output:
. use HTML entities names for @H and @dotaccent accents types
diff --git a/tta/C/main/document.c b/tta/C/main/document.c
index f554ad555a..bea53f01a7 100644
--- a/tta/C/main/document.c
+++ b/tta/C/main/document.c
@@ -208,6 +208,7 @@ void
set_output_encoding (OPTIONS *customization_information, DOCUMENT *document)
{
if (customization_information
+ && !customization_information->OUTPUT_ENCODING_NAME.o.string
&& document && document->global_info.input_encoding_name) {
option_set_conf (&customization_information->OUTPUT_ENCODING_NAME, -1,
document->global_info.input_encoding_name);
diff --git a/tta/data/converters_defaults.txt b/tta/data/converters_defaults.txt
index bd7b4f8dae..29ae0e9439 100644
--- a/tta/data/converters_defaults.txt
+++ b/tta/data/converters_defaults.txt
@@ -114,7 +114,6 @@ NO_CSS 0
NO_NUMBER_FOOTNOTE_SYMBOL *
NODE_NAME_IN_MENU 1
OPEN_QUOTE_SYMBOL undef
-OUTPUT_ENCODING_NAME utf-8
SECTION_NAME_IN_TITLE 0
SHORT_TOC_LINK_TO_TOC 1
SHOW_TITLE undef
diff --git a/tta/perl/Texinfo/Common.pm b/tta/perl/Texinfo/Common.pm
index 4054ba4321..7387802a6a 100644
--- a/tta/perl/Texinfo/Common.pm
+++ b/tta/perl/Texinfo/Common.pm
@@ -1338,10 +1338,13 @@ sub set_output_encoding($$) {
if (defined($document)) {
$document_information = $document->global_information();
}
- $customization_information->set_conf('OUTPUT_ENCODING_NAME',
- $document_information->{'input_encoding_name'})
- if (defined($document_information)
- and exists($document_information->{'input_encoding_name'}));
+
+ if (!$customization_information->get_conf('OUTPUT_ENCODING_NAME')
+ and defined($document_information)
+ and exists($document_information->{'input_encoding_name'})) {
+ $customization_information->set_conf('OUTPUT_ENCODING_NAME',
+ $document_information->{'input_encoding_name'})
+ }
}
# $DOCUMENT is the parsed Texinfo document. It is optional, but it
diff --git a/tta/perl/Texinfo/Convert/DocBook.pm b/tta/perl/Texinfo/Convert/DocBook.pm
index 604cca6678..49c838f31c 100644
--- a/tta/perl/Texinfo/Convert/DocBook.pm
+++ b/tta/perl/Texinfo/Convert/DocBook.pm
@@ -57,7 +57,6 @@ my %defaults = (
# Customization option variables
'FORMAT_MENU' => 'nomenu',
'EXTENSION' => 'xml', # dbk?
- 'OUTPUT_ENCODING_NAME' => 'utf-8',
'SPLIT' => '',
'OPEN_QUOTE_SYMBOL' => '&#'.hex('2018').';',
'CLOSE_QUOTE_SYMBOL' => '&#'.hex('2019').';',
diff --git a/tta/perl/Texinfo/Convert/LaTeX.pm b/tta/perl/Texinfo/Convert/LaTeX.pm
index 3bfc247e39..3c8b0fecac 100644
--- a/tta/perl/Texinfo/Convert/LaTeX.pm
+++ b/tta/perl/Texinfo/Convert/LaTeX.pm
@@ -819,6 +819,7 @@ my %defaults = (
'FORMAT_MENU' => 'nomenu',
'EXTENSION' => 'tex',
'paragraphindent' => undef, # global default is for Info/Plaintext
+ 'OUTPUT_ENCODING_NAME' => 'utf-8'
);
diff --git a/tta/perl/Texinfo/Convert/TexinfoXML.pm b/tta/perl/Texinfo/Convert/TexinfoXML.pm
index 3d99e26bbf..8040f5c741 100644
--- a/tta/perl/Texinfo/Convert/TexinfoXML.pm
+++ b/tta/perl/Texinfo/Convert/TexinfoXML.pm
@@ -45,7 +45,6 @@ my %defaults = (
# Customization option variables
'FORMAT_MENU' => 'menu',
'EXTENSION' => 'xml',
- 'OUTPUT_ENCODING_NAME' => 'utf-8',
'SPLIT' => '',
);
diff --git a/tta/perl/Texinfo/Convert/Text.pm b/tta/perl/Texinfo/Convert/Text.pm
index 7408985f8b..20462aa2a4 100644
--- a/tta/perl/Texinfo/Convert/Text.pm
+++ b/tta/perl/Texinfo/Convert/Text.pm
@@ -954,7 +954,7 @@ sub convert($$) {
if (defined($document)) {
$global_info = $document->global_information();
- # same as Texinfo::Common::set_output_encoding
+ # similar to Texinfo::Common::set_output_encoding
$self->{'OUTPUT_ENCODING_NAME'} = $global_info->{'input_encoding_name'}
if (defined($global_info)
and exists($global_info->{'input_encoding_name'}));
@@ -991,7 +991,7 @@ sub output($$) {
if ($document) {
$global_info = $document->global_information();
- # same as Texinfo::Common::set_output_encoding
+ # similar to Texinfo::Common::set_output_encoding
$self->{'OUTPUT_ENCODING_NAME'} = $global_info->{'input_encoding_name'}
if (defined($global_info)
and exists($global_info->{'input_encoding_name'}));