Re: From wchar_t to char32_t

2023-07-03 Thread Paul Eggert
On 2023-07-03 15:00, Bruno Haible wrote: Level 3: Behave correctly. Don't split a 2-Unicode-character sequence. This is what code that uses mbrtoc32() does, when it has the lines if (bytes == (size_t) -3) bytes = 0; and us

Re: libunistring v1.1 : 22 errors during `make check`

2023-07-03 Thread Bruno Haible
Russell Warren wrote: > > In fact, libunistring relies on the iconv API (in libc or libiconv), not on > > the 'iconv' program. > > > > That is what I thought as well, so I had deleted the iconv executable in my > build script. This caused the 22 failures. The only change made between 22 > failure

Re: From wchar_t to char32_t

2023-07-03 Thread Bruno Haible
Paul Eggert wrote: > The complication would be needed because diffutils is trying to count > columns as it goes, and in some cases it needs to stop when a column > count has reached a maximum. It's not two lines of code. Indeed. I need to check the mbiter and mbuiter modules, since they do somet

Re: libunistring v1.1 : 22 errors during `make check`

2023-07-03 Thread Russell Warren
On Sun, Jul 2, 2023, 7:55 AM Bruno Haible wrote: > In fact, libunistring relies on the iconv API (in libc or libiconv), not on > the 'iconv' program. > That is what I thought as well, so I had deleted the iconv executable in my build script. This caused the 22 failures. The only change made bet

proposed performance tweaks to Gnulib mbchar module

2023-07-03 Thread Paul Eggert
Attached are two proposed performance tweaks I found by inspection. No big deal of course. From 775a34de03f0c4cc9a8a87e65030d19733301193 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Mon, 3 Jul 2023 10:54:36 -0700 Subject: [PATCH 1/2] mbchar: treat @, $, ` as basic The C standard says that @,

Re: From wchar_t to char32_t

2023-07-03 Thread Paul Eggert
Come to think of it this (size_t) -3 issue with mbrtoc32 is probably worth documenting. I installed the attached to give it a shot. From e046d5458353f112e78893ca03d855c8a9aa2e39 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Mon, 3 Jul 2023 10:24:05 -0700 Subject: [PATCH] mbrtoc32: document (siz

Re: From wchar_t to char32_t

2023-07-03 Thread Paul Eggert
On 2023-07-02 13:18, Bruno Haible wrote: If (size_t) -3 is possible, I suppose I should change diffutils to take this into account, as bleeding-edge diffutils/src/side.c treats (size_t) -3 as meaning the next input byte is an encoding error, which is obviously wrong. If you want the diffutils co