On Sun, Mar 02, 2025 at 11:43:37AM +0000, Gavin Smith wrote:
> On Sun, Mar 02, 2025 at 12:27:49PM +0100, pertu...@free.fr wrote:
> Could we look at extending the htmlxref.cnf format?
>
> As well as mono/chapter/section/node, like:
>
> GS = ${G}/software
> hello mono ${GS}/hello/manual/hello.html
> hello chapter ${GS}/hello/manual/html_chapter/
> hello section ${GS}/hello/manual/html_section/
> hello node ${GS}/hello/manual/html_node/
>
> - there could be suffixed versions giving the transliteration status.
>
> It could be something like "node.translit" to give the location of
> an online manual split by node, which nodes are named using transliteration:
>
> hello node.translit ${GS}/hello/manual/html_node/

Another option could be to consider that all the split possibilities of
a manual have the same transliteration/link type option, and use another
line like

 hello type translit
 ...
 emacs type utf8

and there would also be the possibility to set plain/default/expand, to
override a previous entry and reset to the default

 mymanual type default

> If this is the line that is used for links to "hello", then any links
> to that manual would have transliteration applied.
>
> This would allow only using transliteration for links to external
> manuals that need it.

This would remove the need to have something like
TRANSLITERATE_EXTERNAL_FILE_NAMES and still cater for the main types of
use, but TRANSLITERATE_EXTERNAL_FILE_NAMES could still be relevant if a
user wants to override the default for manuals that are not in the
htmlxref information.  We could wait for users to ask for it, though.

> As below, we should always use Text::Unidecode for transliteration
> if possible.
>
> > Date: Mon, 10 Feb 2025 15:11:03 +0100
> > From: pertu...@free.fr
> > To: Werner LEMBERG <w...@gnu.org>
> > Cc: gavinsmith0...@gmail.com, bug-texinfo@gnu.org
> > Subject: Re: normalization problem with `@anchor` targets
> >
> > Note that the transliteration may also be different in tests and in
> > regular output, to get reproducible output. If C is used, for instance,
> > iconv //TRANSLIT is used in output (which is actually a risk for
> > reproducible cross manuals references), while Text::Unidecode or
> > Text::Unidecode compatible transliterations are used in tests.
>
> If in future we allow non-ASCII characters in output HTML file names, we
> could also have "node.utf8".
>
> For completeness, there should also be a name for the current default
> - maybe something like "node.plain" or "node.expand" (referencing
> the "HTML Xref Node Name Expansion" spec).
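
To make the two variants easier to compare, here is a rough Python sketch
(not the texi2any code, which is Perl) of how both could be read from the
same file.  It accepts the suffixed keywords ("node.translit") as well as
a per-manual "type" line.  The function names, the "default" fallback
name, and the rule that a suffix on the split keyword wins over the
"type" line are all assumptions made for illustration; nothing here has
been decided.  A later "type" line simply replaces an earlier one, and
precedence across several htmlxref.cnf files is not modelled.

```python
import re

VAR_REF = re.compile(r"\$\{(\w+)\}")
VAR_DEF = re.compile(r"^(\w+)\s*=\s*(.*)$")


def expand(value, variables):
    """Replace ${NAME} with its value; unknown names are left untouched."""
    return VAR_REF.sub(lambda m: variables.get(m.group(1), m.group(0)), value)


def parse_htmlxref(text):
    """Parse htmlxref-like text into {manual: {"type": ..., "urls": {...}}}.

    Hypothetical extensions from this thread:
      hello node.translit URL   (suffix on the split keyword)
      hello type translit       (one link type for the whole manual)
    """
    variables = {}
    manuals = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        definition = VAR_DEF.match(line)
        if definition:
            name, value = definition.groups()
            variables[name] = expand(value, variables)
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # not a usable entry in this sketch
        manual, keyword = fields[0], fields[1]
        value = expand(fields[2], variables) if len(fields) > 2 else None
        entry = manuals.setdefault(manual, {"type": None, "urls": {}})
        if keyword == "type":
            entry["type"] = value  # a later "type" line replaces an earlier one
        else:
            split, _, suffix = keyword.partition(".")
            entry["urls"][split] = (value, suffix or None)
    return manuals


def link_type(manuals, manual, split):
    """Link type for a manual/split pair: suffix, then "type" line, then default."""
    entry = manuals.get(manual)
    if entry is None:
        return "default"
    _url, suffix = entry["urls"].get(split, (None, None))
    return suffix or entry["type"] or "default"


SAMPLE = """
G = https://www.gnu.org
GS = ${G}/software
hello mono ${GS}/hello/manual/hello.html
hello node.translit ${GS}/hello/manual/html_node/
emacs type utf8
emacs node ${GS}/emacs/manual/html_node/
mymanual type translit
mymanual type default
"""

manuals = parse_htmlxref(SAMPLE)
print(link_type(manuals, "hello", "node"))     # translit (from the suffix)
print(link_type(manuals, "emacs", "node"))     # utf8 (from the "type" line)
print(link_type(manuals, "mymanual", "node"))  # default (reset by the later line)
```

With the "type" line, the split keywords stay as they are today and the
link type is stated once per manual; with the suffix, each split location
could in principle carry its own link type.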