RE: lib/fnmatch.c: mistakes `char32_t` rather than `wchar_t` on Windows

2024-07-12 Thread YX Hao
Hi Bruno,

> The dependencies to these modules (unistr/u32-strlen and unistr/u32-pcpy) are 
> declared in modules/fnmatch, as well as the link requirements
> Could it be that

> This workaround of yours disables the support of Unicode characters outside 
> the BMP (used by Chinese, Emoji, and many other scripts) in fnmatch.
> That support is only present through char32_t. That's actually the point of 
> using char32_t. [1]
> [1] https://www.gnu.org/software/gnulib/manual/html_node/Characters.html

Thank you for sharing how the components cooporate!
It help a lot to understand gnulib.

It seems Windows just use `wchar_t` for Unicode:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/strlen-wcslen-mbslen-mbslen-l-mbstrlen-mbstrlen-l
https://learn.microsoft.com/en-us/windows/win32/Intl/unicode-and-character-set-functions

I thought gnulib follows Windows behavior on Windows. In fact, in `fnmatch`,
you convert characters to UTF32. Thus using `char32_t`.

```c
#include 
#include 
#include 

void main(void){
wchar_t *u16le = L"\uD83D\uDE00"; // 😀
printf("%d\n", wcslen(u16le));
}
```
and
```js
console.log('😀'==='\uD83D\uDE00');
console.log('😀'.length);
```
yield 2. It is a bit weird.


For the linking errors:
Only the `fuzz` linking fails, but `build` and `tests` succeed,
when shared building but not static. 4 months ago and 2 days ago.
Today, when I try to download the `.a` files and `Makefile` to do more
research, all succeeded, without hacking, the same gcc version.
This is really weird!

Let it be, until it's reproduced.


Thank you very much!


Best Regards,
YX Hao





Re: Integer overflows in memchr

2024-07-12 Thread Paul Eggert

On 6/30/24 13:14, Po Lu wrote:

I think there should be a trivial test for a functional strnlen in
strnlen.m4, since it would be terrible to duplicate what ought to be the
responsibility of Gnulib in Emacs's configure.ac.


Here's a first cut at doing that, as a patch to Emacs master that I have 
not installed. Could you please give it a try? If it works I can migrate 
it into Gnulib and Autoconf.


The new code in strnlen.m4 isn't quite the same as the Emacs 
configure.ac code it replaces, but I hope it's good enough. I should be 
more portable as it doesn't depend on Emacs configure.ac vars.
From 2e8c14206b1a8e3f8bcf34639b3d65861741bd64 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 12 Jul 2024 17:23:30 +0100
Subject: [PATCH] Improve strnlen bug detection

The workaround for the Android 5.0 (API 21) strnlen bug
is now done in a different way in m4/strnlen.m4, with the idea of
migrating that into Gnulib and then to Autoconf.
* configure.ac (ORIGINAL_AC_FUNC_STRNLEN, AC_FUNC_STRNLEN):
Remove.
* m4/strnlen.m4 (AC_FUNC_STRNLEN): Replace if Autoconf 2.72 or
earlier, with code that detects the Android problem with strnlen.
This version works around some further bugs in the test, notably,
misplaced 'volatile' and need for volatile in the AIX 4.3 bug
check too.
---
 configure.ac  | 24 
 m4/strnlen.m4 | 51 ++-
 2 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/configure.ac b/configure.ac
index 54c46151bd5..b53224e764d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1611,30 +1611,6 @@ AC_DEFUN
   [HAVE_OFF64_T=1
AC_SUBST([HAVE_OFF64_T])])
 
-# `strnlen' cannot accept nlen greater than the size of the object S
-# on Android 5.0 and earlier.
-m4_define([ORIGINAL_AC_FUNC_STRNLEN], m4_defn([AC_FUNC_STRNLEN]))
-AC_DEFUN([AC_FUNC_STRNLEN], [
-AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])dnl
-AC_REQUIRE([AC_CANONICAL_HOST])dnl
-AC_CACHE_CHECK([for strnlen capable of accepting large limits],
-  [emacs_cv_func_strnlen_working],
-  [AC_RUN_IFELSE([AC_LANG_PROGRAM([AC_INCLUDES_DEFAULT], [[
-  volatile size_t (*strnlen_pointer) (const char *s, size_t) = &strnlen;
-  if ((*strnlen_pointer) ("", -1) != 0)
-return 1;
-  return 0;
-]])],[emacs_cv_func_strnlen_working=yes],
- [emacs_cv_func_strnlen_working=no],
- [# Guess no on Android 21 and earlier, yes elsewhere.
-  AS_IF([test -n "$ANDROID_SDK" && test "$ANDROID_SDK" -lt 22],
-[emacs_cv_func_strnlen_working=no],
-	[emacs_cv_func_strnlen_working='guessing yes'])])])
-AS_IF([test "$emacs_cv_func_strnlen_working" != "no"],
-  [ORIGINAL_AC_FUNC_STRNLEN],
-  [ac_cv_func_strnlen_working=no
-   AC_LIBOBJ([strnlen])])])
-
 # Initialize gnulib right after choosing the compiler.
 dnl Amongst other things, this sets AR and ARFLAGS.
 gl_EARLY
diff --git a/m4/strnlen.m4 b/m4/strnlen.m4
index b4d2778524e..83a75c0c327 100644
--- a/m4/strnlen.m4
+++ b/m4/strnlen.m4
@@ -1,11 +1,60 @@
 # strnlen.m4
-# serial 14
+# serial 15
 dnl Copyright (C) 2002-2003, 2005-2007, 2009-2024 Free Software Foundation,
 dnl Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
 dnl with or without modifications, as long as this notice is preserved.
 
+m4_version_prereq([2.73], [], [
+# Replace AC_FUNC_STRNLEN from Autoconf 2.72 and earlier,
+# which does not check for Android strnlen bugs.
+
+AC_DEFUN([AC_FUNC_STRNLEN],
+[AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])dnl
+AC_CACHE_CHECK([for working strnlen], [ac_cv_func_strnlen_working],
+[AC_RUN_IFELSE(
+   [AC_LANG_PROGRAM(
+  [AC_INCLUDES_DEFAULT
+   [/* Use pstrnlen to test; 'volatile' prevents the compiler
+   from optimizing the strnlen calls away.  */
+size_t (*volatile pstrnlen) (char const *, size_t) = strnlen;
+char const s[] = "foobar";
+int s_len = sizeof s - 1;
+   ]],
+  [[
+/* AIX 4.3 is buggy: strnlen (S, 1) == 3.  */
+int i;
+for (i = 0; i < s_len + 1; ++i)
+  {
+int expected = i <= s_len ? i : s_len;
+if (pstrnlen (s, i) != expected)
+  return 1;
+  }
+
+/* Android 5.0 (API 21) strnlen ("", SIZE_MAX) incorrectly crashes.  */
+if (pstrnlen ("", -1) != 0)
+  return 1;]])],
+   [ac_cv_func_strnlen_working=yes],
+   [ac_cv_func_strnlen_working=no],
+   [AC_COMPILE_IFELSE(
+  [AC_LANG_PROGRAM([AC_INCLUDES_DEFAULT],
+ [[#if defined _AIX && !defined _AIX51
+#error "AIX pre 5.1 is buggy"
+   #endif
+   #ifdef __ANDROID__
+#include 
+#if __ANDROID_API__ < 22
+ #error "Android API < 22 is buggy"
+#endif
+   #endif
+ ]])],
+  [ac_cv_func_strnlen_working=yes],
+  [ac_cv_func_strnlen_working=no])])])
+test $ac_cv_func_strnlen_working = no && AC_LIBOBJ([strnlen])
+])# AC_FUNC_STRNLEN
+])
+
 AC_DEFUN([gl_FUNC_STRNLEN],
 [
 

timespec_sub bug fix

2024-07-12 Thread Bruno Haible
I needed more infos about this bug fix (so I can know why the CI did
not report it within three weeks). Since bug-gnulib was not in CC, I had
to search. I updated this ChangeLog entry:


2024-07-10  Pip Cet  

timespec-sub: Fix compilation error on clang.
Reported by Gerd Möllmann  at
.
* lib/timespec-sub.c (timespec_sub): Use 'int' as type of variable.
Copyright-paperwork-exempt: Yes







Re: timespec_sub bug fix

2024-07-12 Thread Collin Funk
Bruno Haible  writes:

> I needed more infos about this bug fix (so I can know why the CI did
> not report it within three weeks). Since bug-gnulib was not in CC, I had
> to search. I updated this ChangeLog entry:

timespec-sub.c:38:12: error: operand argument to checked integer operation must 
be an integer type
  other than plain 'char', 'bool', bit-precise, or an enumeration ('bool' 
invalid)
   38 |   if (v == ckd_sub (&rs, rs, borrow))
  |^
/opt/homebrew/Cellar/llvm/18.1.8/lib/clang/18/include/stdckdint.h:38:54: note: 
expanded from macro
  'ckd_sub'
   38 | #define ckd_sub(R, A, B) __builtin_sub_overflow((A), (B), (R))
  |  ^

Good to know thanks.

I was going to say that I wish it just treated bools as 0/1 but per the
documentation for ckd_* in C23 § 7.20 [1]:

Both type2 and type3 shall be any integer type other than "plain"
char, bool, a bit-precise integer type, or an enumerated type, and
they need not be the same. *result shall be a modifiable lvalue of
any integer type other than "plain" char, bool, a bit-precise
integer type, or an enumerated type.

So I guess I can't complain.

Collin

[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf



Re: timespec_sub bug fix

2024-07-12 Thread Paul Eggert

On 7/13/24 04:20, Collin Funk wrote:

 Both type2 and type3 shall be any integer type other than "plain"
 char, bool, a bit-precise integer type, or an enumerated type, and
 they need not be the same.


Yes, and it's annoying that you can't use 'bool' there. In theory, for 
example, it means you can't do time_t arithmetic with ckd_add because 
time_t might be one of the prohibited types. If we had to worry about 
that possibility our code would be complicated needlessly - so I don't 
worry about it.


I don't know why that limitation is there. Although I was tempted to fix 
this particular issue with the one-byte fix of inserting "+", e.g., 
ckd_sub (&rs, rs, borrow) -> ckd_sub (&rs, rs, +borrow), I stuck with 
the bool -> int fix of the original reporter.