On Fri, Feb 07, 2025 at 09:19:28AM +0200, Eli Zaretskii wrote:
> > Date: Thu, 6 Feb 2025 21:33:31 +0100
> > From: pertu...@free.fr
> > Cc: bug-texinfo@gnu.org
> >
> > On Thu, Feb 06, 2025 at 07:53:06PM +0000, Werner LEMBERG wrote:
> > >
> > > OK, so let me recapitulate: If I want to support Texinfo manuals with
> > > HTML output that could have both `@anchor{o}` and `@anchor{ö}`, and
> > > there are other manuals that do cross references to these two anchors,
> > > I must to use `--no-transliterate-file-names`, right?
> >
> > Right. Or set TRANSLITERATE_FILE_NAMES to 0 in init file.
>
> Really? Isn't texi2any supposed to handle file-name clashes in this
> and other cases? I thought it did.

If I understand correctly, the issue with the clashes, such as that
between Bogen and Bögen, only occurs with links from other manuals to
anchors, which go via redirection pages. For internal links, which are
the vast majority of links, there is no clash, as the link is generated
directly to the correct location.

It could be possible for @node's "Bogen" and "Bögen" to both be present
in the same output file "Bogen.html", but links to these would be
distinguished by the fragment identifier AFAIK (something like
'Bogen.html#Bogen' and 'Bogen.html#B_00f6gen' - I haven't checked).

I hadn't appreciated this point, so it is not as urgent to change the
output as I had thought.

I think we should add an option to use UTF-8 file names, anyway, so that
users can avoid the infelicities of dropping umlauts in German, changing
"ö" to "o" in Swedish, etc. (It would be off by default, for the benefit
of MS-Windows at least.) It would also be a way to avoid
incompatibilities between Text::Unidecode and some '//TRANSLIT' iconv
conversions.

This would be much better than trying to add more options to specify
exact transliteration rules in different contexts.
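
To make the Bogen/Bögen case concrete, here is a rough Python sketch.
It is not texi2any's code (which is Perl and uses Text::Unidecode for
the transliteration). The function names are made up for illustration,
the "strip combining marks" step is only a crude stand-in for real
transliteration, and the "_" plus four lowercase hex digits escaping is
my assumption about the identifier rule, chosen to match the
'B_00f6gen' guess above.

```python
import unicodedata


def fragment_id(name: str) -> str:
    """Hypothetical helper: build an ASCII-only identifier.

    Assumed rule: ASCII letters and digits are kept, a space becomes
    '-', and anything else becomes '_' plus the four-digit lowercase
    hex code point.
    """
    out = []
    for ch in unicodedata.normalize("NFC", name):
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        elif ch == " ":
            out.append("-")
        else:
            out.append("_%04x" % ord(ch))
    return "".join(out)


def transliterated_file_name(name: str) -> str:
    """Hypothetical helper: file name with diacritics dropped.

    Stand-in for transliteration: decompose to NFD and discard the
    combining marks, so U+00F6 (o with diaeresis) becomes a plain 'o'.
    """
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return fragment_id(stripped) + ".html"


for node in ("Bogen", "B\u00f6gen"):
    # Same file name for both nodes, but different fragment identifiers.
    print(transliterated_file_name(node) + "#" + fragment_id(node))
```

Under these assumptions both nodes land in "Bogen.html", and only the
part after "#" tells them apart. A redirection page named after the
transliterated file name alone has nothing to distinguish them, which
is where the clash for links from other manuals would come from.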