On 19/11/2024 16:08, Bruno Haible wrote:
> Pádraig Brady wrote:
>> On your macos 15 system, iconv() of 0x30 is failing to convert from utf8 to C,
>> and the fallback in unicodeio.c is outputting the \u0030.
>> Now I don't have access to macos to see exactly why that iconv() is failing,
>
> The macOS iconv() is unusable, since macOS 12. [1][2]
> A user needs to install GNU libiconv there, if they want programs to behave
> normally. As documented in gnulib/DEPENDENCIES.
>
> You may experiment with some patch to unicodeio.c, based on [2].
>
> Or how about bypassing unicode_to_mb entirely if CODE >= 0 && CODE < 128 ?
> unicode_to_mb was designed for specific characters like the Copyright sign
> or the quotation marks. For plain ASCII characters you can bypass it.
>
> Bruno
>
> [1] https://lists.gnu.org/archive/html/bug-gnulib/2024-02/msg00123.html
> [2] https://lists.gnu.org/archive/html/bug-gnulib/2024-02/msg00217.html

I would prefer to bypass the ASCII case if CODE >= 0 && CODE < 128.
However, is that generally correct? Consider EBCDIC, for example, with
something like:
LC_ALL=EBCDIC-US env printf '\u0030\n'
Now that's not supported anyway on my Fedora 40 system at least,
but perhaps it might be in some setups?
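
To make the question concrete, here is a rough sketch of the kind of bypass
being discussed. It is not the unicodeio.c code: encode_char_sketch and
fallback_escape are hypothetical names, and fallback_escape only imitates the
\uXXXX output seen in the report rather than calling iconv(). The sketch
assumes the locale encoding is ASCII-compatible, which is exactly what the
EBCDIC case would violate.

```c
#include <stdio.h>

/* Hypothetical stand-in for the existing slow path: it only emits the
   escaped form, where the real code would first try an iconv() conversion.
   Returns the number of bytes stored in BUF, excluding the NUL.  */
static size_t
fallback_escape (unsigned int code, char *buf, size_t buflen)
{
  int n;
  if (code < 0x10000)
    n = snprintf (buf, buflen, "\\u%04X", code);
  else
    n = snprintf (buf, buflen, "\\U%08X", code);
  return n < 0 ? 0 : (size_t) n;
}

/* Hypothetical entry point: store the representation of CODE in BUF.  */
static size_t
encode_char_sketch (unsigned int code, char *buf, size_t buflen)
{
  /* ASCII bypass.  The test '0' == 0x30 only checks the compile-time
     execution character set; it says nothing about the runtime locale.
     In an EBCDIC locale the digit zero is 0xF0, so emitting the byte
     0x30 here would be wrong.  */
  if (code < 128 && '0' == 0x30 && buflen >= 2)
    {
      buf[0] = (char) code;
      buf[1] = '\0';
      return 1;
    }
  return fallback_escape (code, buf, buflen);
}
```

With this, encode_char_sketch (0x30, ...) stores the single byte '0' without
touching iconv(), while U+00A9 still goes through the slow path. A runtime
check of the locale encoding (for instance via nl_langinfo (CODESET)) would
be needed to cover the EBCDIC case, and that part is not sketched here.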
thanks,
Pádraig