On 19/11/2024 17:34, Bruno Haible wrote:
Pádraig Brady wrote:
I would prefer to bypass the ASCII case if CODE >= 0 && CODE < 128.
However is that generally correct?

Yes, at least for CODE >= 32 && CODE < 128 it is correct.
This can be seen from the list of supported locale encodings in
gnulib/lib/localcharset.h.

OK I've adjusted our test to use \u00032 instead,
and tested the attached code, which I'll push later.

thanks,
Pádraig
From e3aa40bba3e1cabd6bcc781eaacc714a840fbc52 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <p...@draigbrady.com>
Date: Tue, 19 Nov 2024 18:11:21 +0000
Subject: [PATCH] unicodeio: avoid iconv issues for most ASCII characters

* lib/unicodeio.c (print_unicode_char): Avoid unicode_to_mb()
for most ASCII characters, to avoid iconv() issues
which were seen on macOS.
Addresses https://bugs.gnu.org/74428
---
 ChangeLog       |  8 ++++++++
 lib/unicodeio.c | 18 +++++++++++++-----
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bf5f6381c1..c4b2290d6c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2024-11-19  Pádraig Brady  <p...@draigbrady.com>
+
+	unicodeio: avoid iconv issues for most ASCII characters
+	* lib/unicodeio.c (print_unicode_char): Avoid unicode_to_mb()
+	for most ASCII characters, to avoid iconv() issues
+	which were seen on macOS.
+	Addresses https://bugs.gnu.org/74428
+
 2024-11-19  Bruno Haible  <br...@clisp.org>
 
 	stdlib: Adjust warning about function 'free'.
diff --git a/lib/unicodeio.c b/lib/unicodeio.c
index e93f86ecba..ca23880f61 100644
--- a/lib/unicodeio.c
+++ b/lib/unicodeio.c
@@ -217,9 +217,17 @@ fallback_failure_callback (unsigned int code,
 void
 print_unicode_char (FILE *stream, unsigned int code, int exit_on_error)
 {
-  unicode_to_mb (code, fwrite_success_callback,
-                 exit_on_error
-                 ? exit_failure_callback
-                 : fallback_failure_callback,
-                 stream);
+  /* Simplify this subset to avoid potential iconv() issues
+     for that range at least.  */
+  if (32 <= code && code < 128)
+    {
+      char code_char = code;
+      fwrite_success_callback (&code_char, sizeof code_char, stream);
+    }
+  else
+    unicode_to_mb (code, fwrite_success_callback,
+                   exit_on_error
+                   ? exit_failure_callback
+                   : fallback_failure_callback,
+                   stream);
 }
-- 
2.47.0

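For anyone who wants to try the range check outside gnulib, here is a minimal standalone sketch of the same idea. It is not the gnulib code: print_code_sketch and emit_via_converter are hypothetical names, the converter is a stub standing in for unicode_to_mb(), and the return value exists only to make the chosen path observable.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the slow path.  In gnulib this is where
   unicode_to_mb() and iconv() would be involved; here it writes nothing.  */
static void
emit_via_converter (FILE *stream, unsigned int code)
{
  (void) stream;
  (void) code;
}

/* Write CODE to STREAM.  Codes in the range 32..127 are written as a
   single byte without any conversion; everything else is handed to the
   converter stub.  Returns 1 if the direct path was taken, 0 otherwise.  */
static int
print_code_sketch (FILE *stream, unsigned int code)
{
  assert (stream != NULL);

  if (32 <= code && code < 128)
    {
      char code_char = (char) code;
      fwrite (&code_char, 1, sizeof code_char, stream);
      return 1;
    }

  emit_via_converter (stream, code);
  return 0;
}
```

Calling print_code_sketch (stdout, 0x41) should take the direct path and write one byte, while 0x1F or 0x80 would go to the converter stub, matching the 32 <= code < 128 bounds used in the patch.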