On Fri, Feb 07, 2025 at 09:19:28AM +0200, Eli Zaretskii wrote:
> > Date: Thu, 6 Feb 2025 21:33:31 +0100
> > From: pertu...@free.fr
> > Cc: bug-texinfo@gnu.org
> > 
> > On Thu, Feb 06, 2025 at 07:53:06PM +0000, Werner LEMBERG wrote:
> > > 
> > > OK, so let me recapitulate: If I want to support Texinfo manuals with
> > > HTML output that could have both `@anchor{o}` and `@anchor{ö}`, and
> > > there are other manuals that do cross references to these two anchors,
> > > I must use `--no-transliterate-file-names`, right?
> > 
> > Right.  Or set TRANSLITERATE_FILE_NAMES to 0 in init file.
> 
> Really?  Isn't texi2any supposed to handle file-name clashes in this
> and other cases?  I thought it did.

If I understand correctly, the issue with the clashes, such as that
between Bogen and Bögen, only occurs with links from other manuals to
anchors, which go via redirection pages.  For internal links, which are
the vast majority of links, there is no clash, as the link is generated
directly to the correct location.  It could be possible for nodes named
"Bogen" and "Bögen" to both end up in the same output file
"Bogen.html", but links to these would be distinguished by the fragment
identifier AFAIK (something like 'Bogen.html#Bogen' and 'Bogen.html#B_00f6gen'
- I haven't checked).
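
To illustrate the point above, here is a rough Python sketch (not the
actual texi2any code, which is written in Perl).  Both function names are
made up for this example.  Stripping combining marks stands in for
Text::Unidecode, and the '_xxxx' escaping is only an approximation of the
HTML cross-reference rules, covering letters, digits and spaces:

```python
import unicodedata


def transliterated_file_name(name: str) -> str:
    """Hypothetical stand-in for transliterated file names.

    Assumption: decomposing and dropping combining marks gives the same
    result as Text::Unidecode for simple cases such as "ö" -> "o".
    """
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.replace(" ", "-") + ".html"


def fragment_id(name: str) -> str:
    """Hypothetical approximation of the fragment identifier.

    ASCII letters and digits are kept, spaces become '-', and any other
    character becomes '_' followed by its code point in lowercase hex
    (at least four digits).
    """
    out = []
    for c in unicodedata.normalize("NFC", name):
        if c.isascii() and c.isalnum():
            out.append(c)
        elif c == " ":
            out.append("-")
        else:
            out.append("_%04x" % ord(c))
    return "".join(out)


for name in ("Bogen", "Bögen"):
    # Same file name for both, but different fragment identifiers.
    print(transliterated_file_name(name) + "#" + fragment_id(name))
```

Under these assumptions both names map to the file "Bogen.html", while
the fragments differ ("Bogen" versus "B_00f6gen"), so links within the
same manual remain unambiguous.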

I hadn't appreciated this point, so it is not as urgent to change the
output as I had thought.

I think we should add an option to use UTF-8 file names, anyway, so
that users can avoid the infelicities of dropping umlauts in German,
changing "ö" to "o" in Swedish, etc.  (It would be off by default, for the
benefit of MS-Windows at least.)

It would also be a way to avoid incompatibilities between Text::Unidecode
and some '//TRANSLIT' iconv conversions.  This would be much better
than trying to add more options to specify exact transliteration rules
in different contexts.
