Re: info --apropos should decode/encode nodes/index entries

Eli Zaretskii Sat, 20 Aug 2022 00:52:59 -0700

> Date: Sat, 20 Aug 2022 09:39:26 +0200
> From: Patrice Dumas <[email protected]>
> Cc: [email protected]
> 
> > As for decoding the document, given that we have the @documentencoding
> > directive, which could specify any encoding whatsoever, the Info
> > reader should use the encoding specified for the document.  This is
> > already fixed for the Emacs reader, which uses the 'coding:' cookie at
> > the end of the Info file, so the simplest thing for the stand-alone
> > reader is to use the same.
> 
> The stand-alone reader already does that for regular Info browsing.
> I tested both in an 8-bit encoded locale and in a UTF-8 locale,
> reading Info files in iso-8859-1 and utf-8 encodings, and it works
> well, including searching.  In the 8bit locale, the UTF-8 characters
> appear as ??? but that's the best possible output.


Ideally, ??? should only appear if the character cannot be encoded in
the locale's codeset.  Otherwise, the reader should encode in the
locale's codeset before writing.  So, for example, Latin-1 characters
in a UTF-8 encoded document should appear as themselves if the
locale's encoding is Latin-1.
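
To make the intended behavior concrete, here is a rough sketch in Python
(not the reader's actual C code). The function name and the sample
string are made up for illustration; the only real API used is Python's
built-in bytes.decode/str.encode with the "replace" error handler, which
substitutes '?' for anything the target codeset cannot represent.

```python
def reencode_for_terminal(raw: bytes, doc_encoding: str, locale_codeset: str) -> bytes:
    """Hypothetical helper: convert document bytes to the locale's codeset.

    Characters that exist in the locale's codeset come out as themselves;
    only the ones that cannot be encoded become '?'.
    """
    # Decode using the encoding declared for the document
    # (what @documentencoding / the 'coding:' cookie would give us).
    text = raw.decode(doc_encoding, errors="replace")
    # Encode for the terminal; unencodable characters become '?'.
    return text.encode(locale_codeset, errors="replace")


# A UTF-8 document containing a Latin-1 character (é) and a non-Latin-1 one (€).
doc = "café costs 5 €".encode("utf-8")
print(reencode_for_terminal(doc, "utf-8", "iso-8859-1"))
```

With a Latin-1 locale the é survives as the single byte 0xE9, and only
the euro sign degrades to '?'.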

(Btw, if we use libiconv for encoding, there's the //TRANSLIT
qualifier, which could handle some characters that are technically not
encodable.)
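
To illustrate the general idea of approximating a character before
giving up, here is a small Python sketch. It is NOT what libiconv's
//TRANSLIT does internally and will not give the same results; it is a
stand-in that uses Unicode compatibility decomposition from the standard
unicodedata module. The function name is hypothetical.

```python
import unicodedata


def encode_with_fallback(text: str, codeset: str) -> bytes:
    """Hypothetical helper: encode, approximating unencodable characters.

    Stand-in for a transliterating converter: try the character as-is,
    then its compatibility decomposition without combining marks,
    and only then fall back to '?'.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(codeset)
            continue
        except UnicodeEncodeError:
            pass
        # Decompose (e.g. a ligature into its letters, an accented letter
        # into base letter + accent) and drop the combining marks.
        approx = "".join(
            c for c in unicodedata.normalize("NFKD", ch)
            if not unicodedata.combining(c)
        )
        try:
            out += approx.encode(codeset) if approx else b"?"
        except UnicodeEncodeError:
            out += b"?"
    return bytes(out)


# The "fi" ligature is not in Latin-1, but its decomposition is.
print(encode_with_fallback("\ufb01le", "iso-8859-1"))
```

Characters with no useful decomposition (a curly quote, say) still end
up as '?', which is where a real transliteration table would do better.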

> There is a specific issue with --apropos, I guess.

Why is it special?

> > When outputting to the terminal, the reader should indeed use the
> > locale's encoding.
> 
> The standalone Info reader always outputs to the terminal...

Yes.  That's why I said "when", not "if" ;-)

Re: info --apropos should decode/encode nodes/index entries
