Hi Arnd,

On Thu, Jul 19, 2018 at 4:50 PM Arnd Bergmann <a...@arndb.de> wrote:
> On a related note, I've looked through all files in the kernel, and found
> that very few files in there are something other than 7-bit ASCII, UTF-8
> or non-text files (according to /usr/bin/file). These are the only ones I
> found:
>
> Documentation/devicetree/bindings/net/nfc/pn544.txt: ISO-8859 text
> arch/arm/boot/dts/sun4i-a10-inet97fv2.dts:           C source, ISO-8859 text
> arch/arm/crypto/sha256_glue.c:                       C source, ISO-8859 text
> arch/arm/crypto/sha256_neon_glue.c:                  C source, ISO-8859 text
> arch/m68k/hp300/hp300map.map:                        ISO-8859 text
> arch/s390/kernel/ebcdic.c:                           C source, Non-ISO extended-ASCII text
> drivers/crypto/vmx/ghashp8-ppc.pl:                   a /usr/bin/env perl script, ISO-8859 text executable
> drivers/iio/dac/ltc2632.c:                           C source, ISO-8859 text
> drivers/power/reset/ltc2952-poweroff.c:              C source, ISO-8859 text
> drivers/staging/rtl8188eu/include/odm.h:             C source, ISO-8859 text
> drivers/tty/vt/defkeymap.map:                        ISO-8859 text
> kernel/events/callchain.c:                           C source, ISO-8859 text
> lib/fonts/font_7x14.c:                               data
> lib/fonts/font_8x16.c:                               data
> lib/fonts/font_8x8.c:                                data
> lib/fonts/font_pearl_8x8.c:                          data
> net/netfilter/ipvs/Kconfig:                          ISO-8859 text
> net/netfilter/ipvs/ip_vs_mh.c:                       C source, ISO-8859 text
> tools/power/cpupower/po/de.po:                       GNU gettext message catalogue, ISO-8859 text
> tools/power/cpupower/po/fr.po:                       GNU gettext message catalogue, ISO-8859 text
>
> Almost all of those can be trivially converted using
> 'recode ISO-8859-1..UTF-8', which we should probably do. The four font
> files contain comments for each of the 256 characters, so that recode
> turns e.g. the <FF> character into <U+00FF>, which is probably still
> what we want here.
>
> The one exception seems to be arch/s390/kernel/ebcdic.c, which apparently
> uses 0x81 bytes as an escape before ISO-8859-1 characters with the high
> bit set. I don't know what that encoding is called, but I managed to
> manually convert it into something useful.

Yes, we should convert everything to UTF-8.
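
For illustration only, here is a rough Python sketch of what such a
conversion amounts to for the ISO-8859 files. It is not the 'recode' tool
itself, and latin1_to_utf8 is a hypothetical helper name. It assumes the
input really is ISO-8859-1 whenever it is not already valid UTF-8, so it
would not be right for the escaped bytes in arch/s390/kernel/ebcdic.c.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper).

    Input that already decodes as UTF-8 (which includes plain 7-bit
    ASCII) is returned unchanged, so running it twice is harmless.
    """
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is ISO-8859-1.
        # Every byte value maps to the code point with the same number.
        return data.decode("iso-8859-1").encode("utf-8")


# The single byte 0xFF becomes U+00FF, encoded in UTF-8 as C3 BF.
converted = latin1_to_utf8(b"\xff")
```

The up-front UTF-8 check is there so that files which are already clean
are not double-encoded.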

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
