Shin Kojima <s...@kojima.org> writes:

> Some multi-byte character encodings (such as Shift_JIS and GBK) have
> characters whose final byte is an ASCII '\' (0x5c), and they
> will be displayed as garbled characters even if $fallback_encoding is
> correct.  This is because the `highlight` command always expects UTF-8
> encoded strings from STDIN.
>
>     $ echo 'my $v = "申";' | highlight --syntax perl | w3m -T text/html -dump
>     my $v = "申";
>
>     $ echo 'my $v = "申";' | iconv -f UTF-8 -t Shift_JIS | highlight \
>         --syntax perl | iconv -f Shift_JIS -t UTF-8 | w3m -T text/html -dump
>
>     iconv: (stdin):9:135: cannot convert
>     my $v = "
>
> This patch converts git blob objects to UTF-8 before highlighting,
> in the same manner as the `to_utf8` subroutine.
> ---
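
For anyone who wants to see the byte in question, here is a rough shell
sketch. It assumes `iconv` (with Shift_JIS support) and `od` are
installed; the variable names are only for illustration. It dumps the
Shift_JIS encoding of "申" and then checks that converting back to UTF-8
restores the original bytes:

```shell
# "申" is U+7533; written as UTF-8 octal escapes so the script stays ASCII.
utf8_bytes='\347\224\263'

# Convert to Shift_JIS and dump the result as hex.
sjis_hex=$(printf "$utf8_bytes" | iconv -f UTF-8 -t Shift_JIS |
           od -An -tx1 | tr -d ' \n')
echo "$sjis_hex"        # 905c: the second byte is an ASCII backslash

# Round-trip back to UTF-8 with nothing in between.
roundtrip_hex=$(printf "$utf8_bytes" | iconv -f UTF-8 -t Shift_JIS |
                iconv -f Shift_JIS -t UTF-8 | od -An -tx1 | tr -d ' \n')
echo "$roundtrip_hex"   # e794b3: same bytes as the input
```

The round trip only breaks once a tool that does not know the encoding
touches the 0x5c byte in the middle of the pipeline.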

The Perl one-liner invoked from the script feels a bit too dense
for my taste, but other than that I have no complaints about what the
patched code does.
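
To spell out the logic for readers of the archive, here is a shell
sketch of the same "use the input as-is if it is valid UTF-8, otherwise
convert from the fallback encoding" idea. This is only an illustration,
not a proposed replacement: `blob_to_utf8` is a hypothetical name, it
assumes `iconv` and `mktemp` are available, and it checks the whole
input at once, whereas the one-liner in the patch works line by line.

```shell
# blob_to_utf8 FALLBACK_ENCODING
# Reads stdin and writes UTF-8 to stdout (hypothetical helper).
blob_to_utf8() {
    tmp=$(mktemp) || return 1
    cat >"$tmp"
    if iconv -f UTF-8 -t UTF-8 "$tmp" >/dev/null 2>&1; then
        # Already valid UTF-8: pass it through untouched.
        cat "$tmp"
    else
        # Not valid UTF-8: assume the fallback encoding given as $1.
        iconv -f "$1" -t UTF-8 "$tmp"
    fi
    rm -f "$tmp"
}

# Shift_JIS bytes 0x90 0x5c ("申") come out as UTF-8 bytes e7 94 b3.
printf '\220\134' | blob_to_utf8 Shift_JIS | od -An -tx1
```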

Jakub, does it look good to you, too?

Please sign off your patch (see Documentation/SubmittingPatches).

Thanks.


>  gitweb/gitweb.perl | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 05d7910..2fddf75 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -3935,6 +3935,9 @@ sub run_highlighter {
>  
>       close $fd;
>       open $fd, quote_command(git_cmd(), "cat-file", "blob", $hash)." | ".
> +               quote_command($^X, '-CO', '-MEncode=decode,FB_DEFAULT', '-pse',
> +                 '$_ = decode($fe, $_, FB_DEFAULT) if !utf8::decode($_);',
> +                 '--', "-fe=$fallback_encoding")." | ".
>                 quote_command($highlight_bin).
>                 " --replace-tabs=8 --fragment --syntax $syntax |"
>               or die_error(500, "Couldn't open file or run syntax highlighter");