On Sat, Sep 13, 2025 at 11:28:56PM +0100, Gavin Smith wrote:
> On Sat, Sep 13, 2025 at 05:48:30PM +0100, Gavin Smith wrote:
> > * The language name mapping is extremely rudimentary:
> >
> > my %highlight_type_languages_name_mappings = (
> > 'source-highlight' => {
> > 'C++' => 'C',
> > 'Perl' => 'perl',
> > },
> > 'highlight' => {
> > 'C++' => 'c++',
> > },
> > 'pygments' => {
> > 'C++' => 'c++',
> > }
> > );
> >
> > Is this useful or necessary for us to maintain on a program-by-program
> > basis?
>
> Here's what I propose about "language name mapping". There are two
> possibilities:
>
> * Basic: The argument on the @example line (or value of
> HIGHLIGHT_SYNTAX_DEFAULT_LANGUAGE) is used directly in the call to the
> syntax highlighting program. This would require the user changing e.g.
> "@example C++" to "@example c++" or "@example C" - not a big deal at all.
>
> * Advanced: If that is not enough: the user has to create their own
> wrapper script which could process the language names.

I would have preferred it to work out of the box, but I agree that a
single language name mapping that suits all users may not be possible.
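
For illustration only, the per-program table could move into the user's
own wrapper script. A minimal sketch in Python, assuming pygments lexer
names; the table contents and the name map_language are hypothetical, not
something highlight_syntax.pm provides:

```python
# Hypothetical mapping kept in the user's own wrapper script.
# Keys are the names written on the @example line; values are the
# names passed to the highlighter (here, pygments lexer aliases).
LANGUAGE_NAMES = {
    "C++": "c++",
    "Perl": "perl",
}


def map_language(name):
    """Return the highlighter's name for NAME, or NAME unchanged."""
    return LANGUAGE_NAMES.get(name, name)
```

Names missing from the table are passed through unchanged, which matches
the "Basic" behaviour for everything the user has not mapped.
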
> The user might need to create their own wrapper script handling language
> names, anyway. In pygments, "lexers" (what we are calling language names)
> have "options", of which there are many:
>
> https://pygments.org/docs/lexers/#
>
> If they want to provide different options for different languages, then this
> information would have to be in their wrapper script.
>
> Generally, there could be many language-specific options that the user
> might want to provide and there is no point for us to try to provide defaults.

The default command for pygments is:

  pygmentize -f html -O noclasses=True

It is generic enough, I believe. My feeling is that it covers most use
cases, although using a wrapper script would also be OK.
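
As a rough sketch of what a wrapper could add on top of that default,
assuming the user wants per-language lexer options: the table
LEXER_OPTIONS and the function build_command are hypothetical names, and
"startinline" (an option of the pygments PHP lexer) is only an example.

```python
# The default options quoted above, as an argument list.
DEFAULT_ARGS = ["pygmentize", "-f", "html", "-O", "noclasses=True"]

# Hypothetical per-language lexer options chosen by the user.
LEXER_OPTIONS = {
    "php": {"startinline": "True"},
}


def build_command(language):
    """Return the pygmentize argument list for LANGUAGE."""
    # -l selects the lexer by name.
    command = DEFAULT_ARGS + ["-l", language]
    for key, value in sorted(LEXER_OPTIONS.get(language, {}).items()):
        # -P passes one extra option as name=value.
        command += ["-P", f"{key}={value}"]
    return command
```

Languages without an entry in the table get exactly the default command
plus the lexer name.
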
> This then has implications for the "checks on languages" - checking that
> the language is recognised by the highlighter program. I have only just
> understood that highlight_syntax.pm does this (the 'highlight_setup' function
> was just a bunch of code I didn't really understand). If the wrapper
> script does its own conversion of language names there can't be any
> error checking on the highlight_syntax.pm side. It's up to the wrapper
> script to make sure it invokes the highlighting program with the correct
> language (lexer) name (and any other options are correct too). Again, this
> does not seem like a problem. We could capture any error output by
> HIGHLIGHT_SYNTAX (or HIGHLIGHT_SYNTAX_PROGRAM or similar variable) so
> that the reason for the error is apparent to the user, and not highlight
> the output if the highlighter program (or wrapper script) exits
> unsuccessfully.

That is already what is done if the HIGHLIGHT_SYNTAX value is none of
"highlight", "pygments" or "source-highlight".
In that case, the user has to do the language analysis herself and
reject or map languages, which is OK for advanced use, but which I
would have liked to avoid for basic use.
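
To make the error handling under discussion concrete, here is a sketch in
Python of the general idea (not the actual highlight_syntax.pm code, which
is Perl): run the program, relay its error output, and signal the caller
to leave the example unhighlighted on failure. The name run_highlighter
is hypothetical.

```python
import subprocess
import sys


def run_highlighter(command, source):
    """Run COMMAND with SOURCE on standard input.

    Return the highlighted text, or None if the program could not be
    run or exited unsuccessfully, so that the caller can leave the
    example unhighlighted.
    """
    try:
        result = subprocess.run(
            command, input=source, capture_output=True, text=True
        )
    except OSError as error:
        # For instance, the program is not installed.
        print(f"{command[0]}: {error}", file=sys.stderr)
        return None
    if result.returncode != 0:
        # Relay the program's own error output so the reason is visible.
        sys.stderr.write(result.stderr)
        return None
    return result.stdout
```

With this shape, a wrapper script that rejects an unknown language only
needs to print a message on standard error and exit with a non-zero
status.
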
--
Pat