Paul Eggert wrote:
> > By reading the source code of FreeBSD, NetBSD, OpenBSD, macOS, Solaris,
> > and so on, I can easily determine
> >   - which parts of the mbstate_t mbsinit() tests,
> >   - which parts of the mbstate_t the various functions use.
> > But in order to understand what interdependencies there are, between
> > the various mbstate_t fields, and what are the assumed invariants,
> > I would need to carefully read each of the mentioned files (one per
> > OS and per locale type).
>
> Yes, and I did that for mbcel - that is, I looked at the source code for
> every coding system used by mbrtoc32 on NetBSD, OpenBSD, FreeBSD,
> Darwin, and DragonFly. The analysis was not as hard as one might think,
> as mbrtoc32 quickly decides whether the state is initial, and mbrtoc32
> is all that matters for mbcel.
>
> I doubt whether other primitives like mbrlen would differ, though I did
> not check this. Also, it's possible I made a mistake in analyzing
> mbrtoc32, though I hope that's unlikely.
I did that analysis again, more carefully than previously, and found that
for macOS, FreeBSD, NetBSD, OpenBSD, Solaris, zeroing the first 12 bytes of
the mbstate_t should be sufficient. (Like you said.)

However, after implementing mbszero with this data and enabling its use
in many places, I got test failures on NetBSD and Solaris.
  - On NetBSD, the minimum we need to clear is 28 bytes.
  - On Solaris OmniOS and OpenIndiana, the minimum we need to clear is
    16 bytes.
  - On proprietary Solaris, the minimum we need to clear is 20 or 28 bytes
    (depending on 32-bit or 64-bit mode).

So, clearly this is fragile stuff. I'm committing it nevertheless, since
it seems that we have a good enough test coverage to detect future changes.


2023-07-16  Bruno Haible  <br...@clisp.org>

	dfa: Optimize clearing an mbstate_t.
	* lib/dfa.c (mbszero) [GAWK]: Add fallback definition.
	(mbs_to_wchar, lex, addtok_wc, dfaexec_main): Use mbszero.
	* modules/dfa (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	uchar-c23: Optimize clearing an mbstate_t.
	* lib/lc-charset-unicode.c (locale_encoding_to_unicode,
	unicode_to_locale_encoding): Use mbszero.
	* modules/uchar-c23 (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	quotearg: Optimize clearing an mbstate_t.
	* lib/quotearg.c: Include <wchar.h>.
	(quotearg_buffer_restyled): Use mbszero.
	* modules/quotearg (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	vasnprintf, vasnwprintf: Optimize clearing an mbstate_t.
	* lib/vasnprintf.c (VASNPRINTF): Use mbszero.
	* modules/vasnprintf (Depends-on): Add mbszero.
	* modules/vasnwprintf (Depends-on): Likewise.
	* modules/c-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u8-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u8-u8-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u16-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u16-u16-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u32-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/u32-u32-vasnprintf (Depends-on): Likewise.
	* modules/unistdio/ulc-vasnprintf (Depends-on): Likewise.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbmemcasecoll: Optimize clearing an mbstate_t.
	* lib/mbmemcasecoll.c (apply_c32tolower): Use mbszero.
	* modules/mbmemcasecoll (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbswidth: Optimize clearing an mbstate_t.
	* lib/mbswidth.c (mbsnwidth): Use mbszero.
	* modules/mbswidth (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbfile: Optimize clearing an mbstate_t.
	* lib/mbfile.h (mbfile_multi_getc, mbf_init): Use mbszero.
	* modules/mbfile (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbuiter: Optimize clearing an mbstate_t.
	* lib/mbuiter.h: Include <wchar.h>.
	(mbuiter_multi_next, mbuiter_multi_copy, mbui_init): Use mbszero.
	* modules/mbuiter (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbiter: Optimize clearing an mbstate_t.
	* lib/mbiter.h: Include <wchar.h>.
	(mbiter_multi_next, mbiter_multi_copy, mbi_init): Use mbszero.
	* modules/mbiter (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	c32stombs: Optimize clearing an mbstate_t.
	* lib/c32stombs.c (c32stombs): Use mbszero.
	* lib/uchar.in.h (c32stombs): Likewise.
	* modules/c32stombs (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbstoc32s: Optimize clearing an mbstate_t.
	* lib/mbstoc32s.c (mbstoc32s): Use mbszero.
	* lib/uchar.in.h (mbstoc32s): Likewise.
	* modules/mbstoc32s (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbstowcs: Optimize clearing an mbstate_t.
	* lib/mbstowcs.c (mbstowcs): Use mbszero.
	* modules/mbstowcs (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	c32tob: Optimize clearing an mbstate_t.
	* lib/c32tob.c (c32tob): Use mbszero.
	* modules/c32tob (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	wctomb: Optimize clearing an mbstate_t.
	* lib/wctomb-impl.h (wctomb): Use mbszero.
	* modules/wctomb (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	btoc32: Optimize clearing an mbstate_t.
	* lib/btoc32.c: Include <wchar.h>.
	(btoc32): Use mbszero.
	* modules/btoc32 (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	btowc: Optimize clearing an mbstate_t.
	* lib/btowc.c (btowc): Use mbszero.
	* modules/btowc (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbrtoc32: Optimize clearing an mbstate_t.
	* lib/mbrtoc32.c (mbrtoc32): Use mbszero.
	* modules/mbrtoc32 (Depends-on): Add mbsinit, mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbtowc: Optimize clearing an mbstate_t.
	* lib/mbtowc-impl.h (mbtowc): Use mbszero.
	* modules/mbtowc (Depends-on): Add mbszero.

2023-07-16  Bruno Haible  <br...@clisp.org>

	mbszero: New module.
	* lib/wchar.in.h: Include <string.h>.
	(_GL_MBSTATE_INIT_SIZE, _GL_MBSTATE_ZERO_SIZE): New macros.
	(mbszero): New declaration.
	* lib/mbrtoc16.c: Update comments.
	* lib/mbszero.c: New file.
	* m4/wchar_h.m4 (gl_WCHAR_H_REQUIRE_DEFAULTS): Initialize
	GNULIB_MBSZERO.
	* modules/wchar (Depends-on): Add extern-inline.
	(Makefile.am): Substitute GNULIB_MBSZERO.
	* modules/mbszero: New file.
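To illustrate the kind of experiment behind the byte counts above, here is a minimal sketch of a probe that zeroes only a prefix of a poisoned mbstate_t and then checks whether the C library still treats it as initial. The function name prefix_zero_suffices is hypothetical and not part of gnulib, and this is not the actual gnulib test code. A passing probe is only evidence, not proof, that a given prefix is enough; probing with a prefix smaller than the whole object hands the library a partly garbage state, so the outcome is platform-specific and may misbehave.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of gnulib: returns true if zeroing only the
   first N bytes of an otherwise poisoned mbstate_t yields a state that
   mbsinit() accepts and that mbrtowc() can use to decode the byte 'a'.  */
bool
prefix_zero_suffices (size_t n)
{
  mbstate_t state;
  wchar_t wc;

  assert (n <= sizeof (mbstate_t));

  /* Poison the whole object, so that stale bytes beyond N are visible
     to the library instead of happening to be zero.  */
  memset (&state, 0xFF, sizeof (mbstate_t));
  /* Clear only the prefix under test.  */
  memset (&state, 0, n);

  if (!mbsinit (&state))
    return false;
  return mbrtowc (&wc, "a", 1, &state) == 1 && wc == L'a';
}
```

Calling prefix_zero_suffices (sizeof (mbstate_t)) exercises the always-safe case; smaller arguments such as 12, 16, 20 or 28 correspond to the per-platform minimums discussed above.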
From c96c9e2d50f46edcd4e56b821bc469949c0dc14e Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:15 +0200 Subject: [PATCH 01/19] mbszero: New module. * lib/wchar.in.h: Include <string.h>. (_GL_MBSTATE_INIT_SIZE, _GL_MBSTATE_ZERO_SIZE): New macros. (mbszero): New declaration. * lib/mbrtoc16.c: Update comments. * lib/mbszero.c: New file. * m4/wchar_h.m4 (gl_WCHAR_H_REQUIRE_DEFAULTS): Initialize GNULIB_MBSZERO. * modules/wchar (Depends-on): Add extern-inline. (Makefile.am): Substitute GNULIB_MBSZERO. * modules/mbszero: New file. --- ChangeLog | 14 ++++ lib/mbrtoc16.c | 14 ++-- lib/mbszero.c | 23 ++++++ lib/wchar.in.h | 204 ++++++++++++++++++++++++++++++++++++++++++++++++ m4/wchar_h.m4 | 3 +- modules/mbszero | 29 +++++++ modules/wchar | 2 + 7 files changed, 282 insertions(+), 7 deletions(-) create mode 100644 lib/mbszero.c create mode 100644 modules/mbszero diff --git a/ChangeLog b/ChangeLog index 68235627fb..151d9b4537 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,17 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + mbszero: New module. + * lib/wchar.in.h: Include <string.h>. + (_GL_MBSTATE_INIT_SIZE, _GL_MBSTATE_ZERO_SIZE): New macros. + (mbszero): New declaration. + * lib/mbrtoc16.c: Update comments. + * lib/mbszero.c: New file. + * m4/wchar_h.m4 (gl_WCHAR_H_REQUIRE_DEFAULTS): Initialize + GNULIB_MBSZERO. + * modules/wchar (Depends-on): Add extern-inline. + (Makefile.am): Substitute GNULIB_MBSZERO. + * modules/mbszero: New file. + 2023-07-15 Bruno Haible <br...@clisp.org> mbsinit: Fix module description. diff --git a/lib/mbrtoc16.c b/lib/mbrtoc16.c index eb73e7d447..1afcde44cc 100644 --- a/lib/mbrtoc16.c +++ b/lib/mbrtoc16.c @@ -54,24 +54,26 @@ static_assert (sizeof (mbstate_t) >= 4); /* macOS, FreeBSD, NetBSD, OpenBSD, Minix */ /* On macOS, mbstate_t is defined in <machine/_types.h>. 
It is an opaque aligned 128-byte struct, of which at most the first - 20 bytes are used (the members are at most: 2x wchar_t, 2x int, 4x char). + 12 bytes are used. For more details, see the __mbsinit implementations in Libc-<version>/locale/FreeBSD/ {ascii,none,euc,mskanji,big5,gb2312,gbk,gb18030,utf8,utf2}.c. */ /* On FreeBSD, mbstate_t is defined in src/sys/sys/_types.h. It is an opaque aligned 128-byte struct, of which at most the first - 20 bytes are used (the members are at most: 2x wchar_t, 2x int, 4x char). + 12 bytes are used. For more details, see the __mbsinit implementations in src/lib/libc/locale/ {ascii,none,euc,mskanji,big5,gb2312,gbk,gb18030,utf8}.c. */ /* On NetBSD, mbstate_t is defined in src/sys/sys/ansi.h. It is an opaque aligned 128-byte struct, of which at most the first - 24 bytes are used (the members are at most: 3x int, 12x char). + 28 bytes are used. For more details, see the *State types in - src/lib/libc/citrus/modules/citrus_*.c. */ + src/lib/libc/citrus/modules/citrus_*.c + (ignoring citrus_{hz,iso2022,utf7,viqr,zw}.c, since these implement + stateful encodings, not usable as locale encodings). */ /* On OpenBSD, mbstate_t is defined in src/sys/sys/_types.h. It is an opaque aligned 128-byte struct, of which at most the first - 12 bytes are used (the members are at most: 2x wchar_t, 1x int). + 12 bytes are used. For more details, see src/lib/libc/citrus/citrus_*.c. */ /* Minix has borrowed its mbstate_t type and mbrtowc implementation from the BSDs. */ @@ -84,7 +86,7 @@ static_assert (sizeof (mbstate_t) >= 4); #elif defined __sun /* Solaris */ /* On Solaris, mbstate_t is defined in <wchar_impl.h>. It is an opaque aligned 24-byte or 32-byte struct, of which at most the first - 20 bytes are used (the members are at most: 2x wchar_t, 2x int, 4x char). + 20 or 28 bytes are used. For more details, see the *State types in illumos-gate/usr/src/lib/libc/port/locale/ {none,euc,mskanji,big5,gb2312,gbk,gb18030,utf8}.c. 
 */
diff --git a/lib/mbszero.c b/lib/mbszero.c
new file mode 100644
index 0000000000..6da91c6b20
--- /dev/null
+++ b/lib/mbszero.c
@@ -0,0 +1,23 @@
+/* Put an mbstate_t into an initial conversion state.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is free software: you can redistribute it and/or modify
+   it under the terms of the GNU Lesser General Public License as
+   published by the Free Software Foundation; either version 2.1 of the
+   License, or (at your option) any later version.
+
+   This file is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public License
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */
+
+/* Written by Bruno Haible <br...@clisp.org>, 2023.  */
+
+#include <config.h>
+
+#define IN_MBSZERO
+/* Specification and implementation.  */
+#include <wchar.h>
diff --git a/lib/wchar.in.h b/lib/wchar.in.h
index 2c878ff815..7d2c5ecd12 100644
--- a/lib/wchar.in.h
+++ b/lib/wchar.in.h
@@ -232,6 +232,13 @@ _GL_EXTERN_C void free (void *);
 # endif
 #endif
 
+
+#if @GNULIB_MBSZERO@
+/* Get memset().  */
+# include <string.h>
+#endif
+
+
 /* Convert a single-byte character to a wide character.  */
 #if @GNULIB_BTOWC@
 # if @REPLACE_BTOWC@
@@ -315,6 +322,203 @@ _GL_WARN_ON_USE (mbsinit, "mbsinit is unportable - "
 #endif
 
 
+/* Put *PS into an initial state.  */
+#if @GNULIB_MBSZERO@
+/* ISO C 23 § 7.31.6.(3) says that zeroing an mbstate_t is a way to put the
+   mbstate_t into an initial state.  However, on many platforms an mbstate_t
+   is large, and it is possible - as an optimization - to get away with zeroing
+   only part of it.
So, instead of + + mbstate_t state = { 0 }; + + or + + mbstate_t state; + memset (&state, 0, sizeof (mbstate_t)); + + we can write this faster code: + + mbstate_t state; + mbszero (&state); + */ +/* _GL_MBSTATE_INIT_SIZE describes how mbsinit() behaves: It is the number of + bytes at the beginning of an mbstate_t that need to be zero, for mbsinit() + to return true. + _GL_MBSTATE_ZERO_SIZE is the number of bytes at the beginning of an mbstate_t + that need to be zero, + - for mbsinit() to return true, and + - for all other multibyte-aware functions to operate properly. + 0 < _GL_MBSTATE_INIT_SIZE <= _GL_MBSTATE_ZERO_SIZE <= sizeof (mbstate_t). + These values are determined by source code inspection. */ +# if GNULIB_defined_mbstate_t /* AIX, IRIX */ +/* mbstate_t has at least 4 bytes. They are used as coded in + gnulib/lib/mbrtowc.c. */ +# define _GL_MBSTATE_INIT_SIZE 1 +/* Note that 4 is not the correct value: it causes test failures. */ +# define _GL_MBSTATE_ZERO_SIZE sizeof (mbstate_t) +# elif __GLIBC__ >= 2 /* glibc */ +/* mbstate_t is defined in <bits/types/__mbstate_t.h>. + For more details, see glibc/iconv/skeleton.c. */ +# define _GL_MBSTATE_INIT_SIZE 4 +# define _GL_MBSTATE_ZERO_SIZE /* 8 */ sizeof (mbstate_t) +# elif defined MUSL_LIBC /* musl libc */ +/* mbstate_t is defined in <bits/alltypes.h>. + It is an opaque aligned 8-byte struct, of which at most the first + 4 bytes are used. + For more details, see src/multibyte/mbrtowc.c. */ +# define _GL_MBSTATE_INIT_SIZE 4 +# define _GL_MBSTATE_ZERO_SIZE 4 +# elif defined __APPLE__ && defined __MACH__ /* macOS */ +/* On macOS, mbstate_t is defined in <machine/_types.h>. + It is an opaque aligned 128-byte struct, of which at most the first + 12 bytes are used. + For more details, see the __mbsinit implementations in + Libc-<version>/locale/FreeBSD/ + {ascii,none,euc,mskanji,big5,gb2312,gbk,gb18030,utf8,utf2}.c. 
*/ +/* File INIT_SIZE ZERO_SIZE + ascii.c 0 0 + none.c 0 0 + euc.c 12 12 + mskanji.c 4 4 + big5.c 4 4 + gb2312.c 4 6 + gbk.c 4 4 + gb18030.c 4 8 + utf8.c 8 10 + utf2.c 8 12 */ +# define _GL_MBSTATE_INIT_SIZE 12 +# define _GL_MBSTATE_ZERO_SIZE 12 +# elif defined __FreeBSD__ /* FreeBSD */ +/* On FreeBSD, mbstate_t is defined in src/sys/sys/_types.h. + It is an opaque aligned 128-byte struct, of which at most the first + 12 bytes are used. + For more details, see the __mbsinit implementations in + src/lib/libc/locale/ + {ascii,none,euc,mskanji,big5,gb2312,gbk,gb18030,utf8}.c. */ +/* File INIT_SIZE ZERO_SIZE + ascii.c 0 0 + none.c 0 0 + euc.c 12 12 + mskanji.c 4 4 + big5.c 4 4 + gb2312.c 4 6 + gbk.c 4 4 + gb18030.c 4 8 + utf8.c 8 12 */ +# define _GL_MBSTATE_INIT_SIZE 12 +# define _GL_MBSTATE_ZERO_SIZE 12 +# elif defined __NetBSD__ /* NetBSD */ +/* On NetBSD, mbstate_t is defined in src/sys/sys/ansi.h. + It is an opaque aligned 128-byte struct, of which at most the first + 28 bytes are used. + For more details, see the *State types in + src/lib/libc/citrus/modules/citrus_*.c + (ignoring citrus_{hz,iso2022,utf7,viqr,zw}.c, since these implement + stateful encodings, not usable as locale encodings). */ +/* File ZERO_SIZE + citrus/citrus_none.c 0 + citrus/modules/citrus_euc.c 8 + citrus/modules/citrus_euctw.c 8 + citrus/modules/citrus_mskanji.c 8 + citrus/modules/citrus_big5.c 8 + citrus/modules/citrus_gbk2k.c 8 + citrus/modules/citrus_dechanyu.c 8 + citrus/modules/citrus_johab.c 6 + citrus/modules/citrus_utf8.c 12 */ +/* But 12 is not the correct value: we get test failures for values < 28. */ +# define _GL_MBSTATE_INIT_SIZE 28 +# define _GL_MBSTATE_ZERO_SIZE 28 +# elif defined __OpenBSD__ /* OpenBSD */ +/* On OpenBSD, mbstate_t is defined in src/sys/sys/_types.h. + It is an opaque aligned 128-byte struct, of which at most the first + 12 bytes are used. + For more details, see src/lib/libc/citrus/citrus_*.c. 
*/ +/* File INIT_SIZE ZERO_SIZE + citrus_none.c 0 0 + citrus_utf8.c 12 12 */ +# define _GL_MBSTATE_INIT_SIZE 12 +# define _GL_MBSTATE_ZERO_SIZE 12 +# elif defined __minix /* Minix */ +/* On Minix, mbstate_t is defined in sys/sys/ansi.h. + It is an opaque aligned 128-byte struct. + For more details, see the *State types in + lib/libc/citrus/citrus_*.c. */ +/* File INIT_SIZE ZERO_SIZE + citrus_none.c 0 0 */ +# define _GL_MBSTATE_INIT_SIZE 1 +# define _GL_MBSTATE_ZERO_SIZE 1 +# elif defined __sun /* Solaris */ +/* On Solaris, mbstate_t is defined in <wchar_impl.h>. + It is an opaque aligned 24-byte or 32-byte struct, of which at most the first + 20 or 28 bytes are used. + For more details, see the *State types in + illumos-gate/usr/src/lib/libc/port/locale/ + {none,euc,mskanji,big5,gb2312,gbk,gb18030,utf8}.c. */ +/* File INIT_SIZE ZERO_SIZE + none.c 0 0 + euc.c 12 12 + mskanji.c 4 4 + big5.c 4 4 + gb2312.c 4 6 + gbk.c 4 4 + gb18030.c 4 8 + utf8.c 12 12 */ +/* But 12 is not the correct value: we get test failures + - in OpenIndiana and OmniOS: for values < 16, + - in Solaris 10 and 11: for values < 20 (in 32-bit mode) + or < 28 (in 64-bit mode). */ +# if defined _LP64 +# define _GL_MBSTATE_INIT_SIZE 28 +# define _GL_MBSTATE_ZERO_SIZE 28 +# else +# define _GL_MBSTATE_INIT_SIZE 20 +# define _GL_MBSTATE_ZERO_SIZE 20 +# endif +# elif defined __CYGWIN__ /* Cygwin */ +/* On Cygwin, mbstate_t is defined in <sys/_types.h>. + For more details, see newlib/libc/stdlib/mbtowc_r.c and + winsup/cygwin/strfuncs.cc. */ +# define _GL_MBSTATE_INIT_SIZE 4 +# define _GL_MBSTATE_ZERO_SIZE 8 +# elif defined _WIN32 && !defined __CYGWIN__ /* Native Windows. */ +/* MSVC defines 'mbstate_t' as an aligned 8-byte struct. + On mingw, 'mbstate_t' is sometimes defined as 'int', sometimes defined + as an aligned 8-byte struct, of which the first 4 bytes matter. 
*/ +# define _GL_MBSTATE_INIT_SIZE sizeof (mbstate_t) +# define _GL_MBSTATE_ZERO_SIZE sizeof (mbstate_t) +# elif defined __ANDROID__ /* Android */ +/* Android defines 'mbstate_t' in <bits/mbstate_t.h>. + It is an opaque 4-byte or 8-byte struct. + For more details, see + bionic/libc/private/bionic_mbstate.h + bionic/libc/bionic/mbrtoc32.cpp + bionic/libc/bionic/mbrtoc16.cpp + */ +# define _GL_MBSTATE_INIT_SIZE 4 +# define _GL_MBSTATE_ZERO_SIZE 4 +# else +/* On platforms where we don't know how the multibyte functions behave, use + these safe values. */ +# define _GL_MBSTATE_INIT_SIZE sizeof (mbstate_t) +# define _GL_MBSTATE_ZERO_SIZE sizeof (mbstate_t) +# endif +_GL_BEGIN_C_LINKAGE +# if defined IN_MBSZERO +_GL_EXTERN_INLINE +# else +_GL_INLINE +# endif +_GL_ARG_NONNULL ((1)) void +mbszero (mbstate_t *ps) +{ + memset (ps, 0, _GL_MBSTATE_ZERO_SIZE); +} +_GL_END_C_LINKAGE +_GL_CXXALIAS_SYS (mbszero, void, (mbstate_t *ps)); +_GL_CXXALIASWARN (mbszero); +#endif + + /* Convert a multibyte character to a wide character. */ #if @GNULIB_MBRTOWC@ # if @REPLACE_MBRTOWC@ diff --git a/m4/wchar_h.m4 b/m4/wchar_h.m4 index 442932be44..31f5b0794d 100644 --- a/m4/wchar_h.m4 +++ b/m4/wchar_h.m4 @@ -7,7 +7,7 @@ dnl Written by Eric Blake. -# wchar_h.m4 serial 60 +# wchar_h.m4 serial 61 AC_DEFUN_ONCE([gl_WCHAR_H], [ @@ -147,6 +147,7 @@ AC_DEFUN([gl_WCHAR_H_REQUIRE_DEFAULTS] gl_MODULE_INDICATOR_INIT_VARIABLE([GNULIB_BTOWC]) gl_MODULE_INDICATOR_INIT_VARIABLE([GNULIB_WCTOB]) gl_MODULE_INDICATOR_INIT_VARIABLE([GNULIB_MBSINIT]) + gl_MODULE_INDICATOR_INIT_VARIABLE([GNULIB_MBSZERO]) gl_MODULE_INDICATOR_INIT_VARIABLE([GNULIB_MBRTOWC]) gl_MODULE_INDICATOR_INIT_VARIABLE([GNULIB_MBRLEN]) gl_MODULE_INDICATOR_INIT_VARIABLE([GNULIB_MBSRTOWCS]) diff --git a/modules/mbszero b/modules/mbszero new file mode 100644 index 0000000000..002fa75014 --- /dev/null +++ b/modules/mbszero @@ -0,0 +1,29 @@ +Description: +mbszero() function: put an mbstate_t into an initial conversion state. 
+ +Files: +lib/mbszero.c +m4/mbstate_t.m4 +m4/mbrtowc.m4 +m4/musl.m4 + +Depends-on: +wchar + +configure.ac: +AC_REQUIRE([AC_TYPE_MBSTATE_T]) +gl_MBSTATE_T_BROKEN +gl_MUSL_LIBC +gl_WCHAR_MODULE_INDICATOR([mbszero]) + +Makefile.am: +lib_SOURCES += mbszero.c + +Include: +<wchar.h> + +License: +LGPLv2+ + +Maintainer: +all diff --git a/modules/wchar b/modules/wchar index 6a17e53034..ebdd0ece61 100644 --- a/modules/wchar +++ b/modules/wchar @@ -13,6 +13,7 @@ include_next snippet/arg-nonnull snippet/c++defs snippet/warn-on-use +extern-inline inttypes-incomplete stddef stdlib @@ -42,6 +43,7 @@ wchar.h: wchar.in.h $(top_builddir)/config.status $(CXXDEFS_H) $(ARG_NONNULL_H) -e 's/@''GNULIB_BTOWC''@/$(GNULIB_BTOWC)/g' \ -e 's/@''GNULIB_WCTOB''@/$(GNULIB_WCTOB)/g' \ -e 's/@''GNULIB_MBSINIT''@/$(GNULIB_MBSINIT)/g' \ + -e 's/@''GNULIB_MBSZERO''@/$(GNULIB_MBSZERO)/g' \ -e 's/@''GNULIB_MBRTOWC''@/$(GNULIB_MBRTOWC)/g' \ -e 's/@''GNULIB_MBRLEN''@/$(GNULIB_MBRLEN)/g' \ -e 's/@''GNULIB_MBSRTOWCS''@/$(GNULIB_MBSRTOWCS)/g' \ -- 2.34.1
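The invariant stated in the wchar.in.h comment of this patch, 0 < _GL_MBSTATE_INIT_SIZE <= _GL_MBSTATE_ZERO_SIZE <= sizeof (mbstate_t), could also be checked at compile time. The following is only a sketch under stated assumptions: the MY_* macros and my_mbszero are hypothetical stand-ins for illustration, set to the safe fallback values (the whole object) so that the fragment compiles without gnulib. It is not what the patch itself does.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for _GL_MBSTATE_INIT_SIZE and _GL_MBSTATE_ZERO_SIZE.
   They use the safe fallback (the whole object); a platform-specific port
   would substitute smaller values found by source code inspection.  */
#define MY_MBSTATE_INIT_SIZE sizeof (mbstate_t)
#define MY_MBSTATE_ZERO_SIZE sizeof (mbstate_t)

/* Compile-time form of the documented invariant (C11 static_assert).  */
static_assert (0 < MY_MBSTATE_INIT_SIZE,
               "mbsinit() must look at least at one byte");
static_assert (MY_MBSTATE_INIT_SIZE <= MY_MBSTATE_ZERO_SIZE,
               "the zeroed prefix must cover what mbsinit() tests");
static_assert (MY_MBSTATE_ZERO_SIZE <= sizeof (mbstate_t),
               "cannot zero more than the object holds");

/* Same shape as the mbszero() in the patch, with the stand-in macro.  */
void
my_mbszero (mbstate_t *ps)
{
  memset (ps, 0, MY_MBSTATE_ZERO_SIZE);
}
```

With such assertions, a mistyped per-platform value that exceeds sizeof (mbstate_t) would fail the build rather than overrun the object at run time.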
>From 4159dd5027ee68631e3514aa31f614a5b0d08303 Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:20 +0200 Subject: [PATCH 02/19] mbtowc: Optimize clearing an mbstate_t. * lib/mbtowc-impl.h (mbtowc): Use mbszero. * modules/mbtowc (Depends-on): Add mbszero. --- ChangeLog | 6 ++++++ lib/mbtowc-impl.h | 2 +- modules/mbtowc | 3 ++- 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/ChangeLog b/ChangeLog index 151d9b4537..e32eaf8ad4 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,9 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + mbtowc: Optimize clearing an mbstate_t. + * lib/mbtowc-impl.h (mbtowc): Use mbszero. + * modules/mbtowc (Depends-on): Add mbszero. + 2023-07-16 Bruno Haible <br...@clisp.org> mbszero: New module. diff --git a/lib/mbtowc-impl.h b/lib/mbtowc-impl.h index 39b977bebc..01fef1823d 100644 --- a/lib/mbtowc-impl.h +++ b/lib/mbtowc-impl.h @@ -30,7 +30,7 @@ mbtowc (wchar_t *pwc, const char *s, size_t n) wchar_t wc; size_t result; - memset (&state, 0, sizeof (mbstate_t)); + mbszero (&state); result = mbrtowc (&wc, s, n, &state); if (result == (size_t)-1 || result == (size_t)-2) { diff --git a/modules/mbtowc b/modules/mbtowc index fcfb1cc431..12cc3de62d 100644 --- a/modules/mbtowc +++ b/modules/mbtowc @@ -8,8 +8,9 @@ m4/mbtowc.m4 Depends-on: stdlib -mbrtowc [test $HAVE_MBTOWC = 0 || test $REPLACE_MBTOWC = 1] wchar [test $HAVE_MBTOWC = 0 || test $REPLACE_MBTOWC = 1] +mbszero [test $HAVE_MBTOWC = 0 || test $REPLACE_MBTOWC = 1] +mbrtowc [test $HAVE_MBTOWC = 0 || test $REPLACE_MBTOWC = 1] configure.ac: gl_FUNC_MBTOWC -- 2.34.1
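For readers without the gnulib tree at hand, the pattern touched by this patch (a stateless mbtowc built on mbrtowc with a freshly initialized state) can be sketched as follows. The name my_mbtowc is hypothetical, the sketch uses a full memset in place of mbszero so that it compiles standalone, and it is not a copy of lib/mbtowc-impl.h.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stateless wrapper around mbrtowc(), for illustration only.
   Returns the number of bytes consumed, 0 for a null character, or -1 for
   an invalid or incomplete multibyte sequence.  */
int
my_mbtowc (wchar_t *pwc, const char *s, size_t n)
{
  mbstate_t state;
  wchar_t wc;
  size_t result;

  if (s == NULL)
    /* Report that the encoding is treated as not state-dependent.  */
    return 0;

  /* Start every call from the initial conversion state.  With gnulib,
     this is the place where mbszero (&state) would be used instead.  */
  memset (&state, 0, sizeof (mbstate_t));

  result = mbrtowc (&wc, s, n, &state);
  if (result == (size_t) -1 || result == (size_t) -2)
    return -1;
  if (pwc != NULL)
    *pwc = wc;
  return (int) result;
}
```

Since the state object lives only for the duration of one call, clearing fewer bytes of it is a pure win here, which is why this call site is a natural candidate for mbszero.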
>From e00ed30ced5b3546eee89823c1ed70ec5c2dcaa7 Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:22 +0200 Subject: [PATCH 03/19] mbrtoc32: Optimize clearing an mbstate_t. * lib/mbrtoc32.c (mbrtoc32): Use mbszero. * modules/mbrtoc32 (Depends-on): Add mbsinit, mbszero. --- ChangeLog | 6 ++++++ lib/mbrtoc32.c | 2 +- modules/mbrtoc32 | 2 ++ 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index e32eaf8ad4..058de1e2cd 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,9 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + mbrtoc32: Optimize clearing an mbstate_t. + * lib/mbrtoc32.c (mbrtoc32): Use mbszero. + * modules/mbrtoc32 (Depends-on): Add mbsinit, mbszero. + 2023-07-16 Bruno Haible <br...@clisp.org> mbtowc: Optimize clearing an mbstate_t. diff --git a/lib/mbrtoc32.c b/lib/mbrtoc32.c index 52bdde2482..558717d517 100644 --- a/lib/mbrtoc32.c +++ b/lib/mbrtoc32.c @@ -130,7 +130,7 @@ mbrtoc32 (char32_t *pwc, const char *s, size_t n, mbstate_t *ps) /* Verify that mbrtoc32 is regular. */ if (ret < (size_t) -3 && ! mbsinit (ps)) /* This occurs on glibc 2.36. 
*/ - memset (ps, '\0', sizeof (mbstate_t)); + mbszero (ps); if (ret == (size_t) -3) abort (); # endif diff --git a/modules/mbrtoc32 b/modules/mbrtoc32 index 061e8a7aec..7315fe1505 100644 --- a/modules/mbrtoc32 +++ b/modules/mbrtoc32 @@ -25,6 +25,8 @@ attribute [test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1] c99 [{ test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1; } && test $REPLACE_MBSTATE_T = 0] hard-locale [{ test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1; } && test $REPLACE_MBSTATE_T = 0] mbrtowc [{ test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1; } && test $REPLACE_MBSTATE_T = 0] +mbsinit [{ test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1; } && test $REPLACE_MBSTATE_T = 0] +mbszero [{ test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1; } && test $REPLACE_MBSTATE_T = 0] assert-h [test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1] localcharset [test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1] streq [test $HAVE_MBRTOC32 = 0 || test $REPLACE_MBRTOC32 = 1] -- 2.34.1
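The hunk above resets the state whenever a complete character was returned but the state is not initial. A rough analogue of that idea, written against mbrtowc rather than mbrtoc32 so that it does not depend on <uchar.h>, might look like the sketch below. The name regular_mbrtowc is hypothetical, and this is an illustration of the technique under that assumption, not the gnulib code.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wrapper, for illustration only: behaves like mbrtowc(), but
   guarantees that *PS is in the initial state after every call that
   returned a complete character.  */
size_t
regular_mbrtowc (wchar_t *pwc, const char *s, size_t n, mbstate_t *ps)
{
  size_t ret = mbrtowc (pwc, s, n, ps);

  /* (size_t) -1 signals an invalid sequence and (size_t) -2 an incomplete
     character; any other value means a complete character was consumed.
     mbsinit (NULL) returns nonzero, so a null PS is never passed to memset.  */
  if (ret != (size_t) -1 && ret != (size_t) -2 && !mbsinit (ps))
    /* With gnulib, mbszero (ps) would be used here instead.  */
    memset (ps, 0, sizeof (mbstate_t));

  return ret;
}
```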
>From c05bdfe0d5b4627b9b528067568d8bac3986fdce Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:25 +0200 Subject: [PATCH 04/19] btowc: Optimize clearing an mbstate_t. * lib/btowc.c (btowc): Use mbszero. * modules/btowc (Depends-on): Add mbszero. --- ChangeLog | 6 ++++++ lib/btowc.c | 2 +- modules/btowc | 3 ++- 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/ChangeLog b/ChangeLog index 058de1e2cd..8694418f5a 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,9 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + btowc: Optimize clearing an mbstate_t. + * lib/btowc.c (btowc): Use mbszero. + * modules/btowc (Depends-on): Add mbszero. + 2023-07-16 Bruno Haible <br...@clisp.org> mbrtoc32: Optimize clearing an mbstate_t. diff --git a/lib/btowc.c b/lib/btowc.c index 4defbdda72..13ceab2e90 100644 --- a/lib/btowc.c +++ b/lib/btowc.c @@ -35,7 +35,7 @@ btowc (int c) buf[0] = c; #if HAVE_MBRTOWC mbstate_t state; - memset (&state, 0, sizeof (mbstate_t)); + mbszero (&state); size_t ret = mbrtowc (&wc, buf, 1, &state); if (!(ret == (size_t)(-1) || ret == (size_t)(-2))) #else diff --git a/modules/btowc b/modules/btowc index 237186322f..99957c83b0 100644 --- a/modules/btowc +++ b/modules/btowc @@ -10,8 +10,9 @@ m4/codeset.m4 Depends-on: wchar -mbtowc [test $HAVE_BTOWC = 0 || test $REPLACE_BTOWC = 1] +mbszero [test $HAVE_BTOWC = 0 || test $REPLACE_BTOWC = 1] mbrtowc [test $HAVE_BTOWC = 0 || test $REPLACE_BTOWC = 1] +mbtowc [test $HAVE_BTOWC = 0 || test $REPLACE_BTOWC = 1] configure.ac: gl_FUNC_BTOWC -- 2.34.1
>From 9ca0b47c2ae72c9c55f7443c2bf68feacb95290a Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:27 +0200 Subject: [PATCH 05/19] btoc32: Optimize clearing an mbstate_t. * lib/btoc32.c: Include <wchar.h>. (btoc32): Use mbszero. * modules/btoc32 (Depends-on): Add mbszero. --- ChangeLog | 7 +++++++ lib/btoc32.c | 3 ++- modules/btoc32 | 1 + 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index 8694418f5a..efb0dbf600 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,10 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + btoc32: Optimize clearing an mbstate_t. + * lib/btoc32.c: Include <wchar.h>. + (btoc32): Use mbszero. + * modules/btoc32 (Depends-on): Add mbszero. + 2023-07-16 Bruno Haible <br...@clisp.org> btowc: Optimize clearing an mbstate_t. diff --git a/lib/btoc32.c b/lib/btoc32.c index c5ed227a8b..4d5b9067e7 100644 --- a/lib/btoc32.c +++ b/lib/btoc32.c @@ -24,6 +24,7 @@ #include <stdio.h> #include <string.h> +#include <wchar.h> #if GL_CHAR32_T_IS_UNICODE # include "lc-charset-unicode.h" @@ -44,7 +45,7 @@ btoc32 (int c) char s[1]; char32_t wc; - memset (&state, '\0', sizeof (mbstate_t)); + mbszero (&state); s[0] = (unsigned char) c; if (mbrtoc32 (&wc, s, 1, &state) <= 1) return wc; diff --git a/modules/btoc32 b/modules/btoc32 index 1a532a9eb3..e6385398ec 100644 --- a/modules/btoc32 +++ b/modules/btoc32 @@ -6,6 +6,7 @@ lib/btoc32.c Depends-on: uchar +mbszero mbrtoc32 btowc -- 2.34.1
>From e5799150ce57b76f8a8bd834eee3e9b4af6f87ef Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:30 +0200 Subject: [PATCH 06/19] wctomb: Optimize clearing an mbstate_t. * lib/wctomb-impl.h (wctomb): Use mbszero. * modules/wctomb (Depends-on): Add mbszero. --- ChangeLog | 6 ++++++ lib/wctomb-impl.h | 2 +- modules/wctomb | 3 ++- 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/ChangeLog b/ChangeLog index efb0dbf600..0bfe766a63 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,9 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + wctomb: Optimize clearing an mbstate_t. + * lib/wctomb-impl.h (wctomb): Use mbszero. + * modules/wctomb (Depends-on): Add mbszero. + 2023-07-16 Bruno Haible <br...@clisp.org> btoc32: Optimize clearing an mbstate_t. diff --git a/lib/wctomb-impl.h b/lib/wctomb-impl.h index 9ba727edf3..e71a15d9e7 100644 --- a/lib/wctomb-impl.h +++ b/lib/wctomb-impl.h @@ -25,7 +25,7 @@ wctomb (char *s, wchar_t wc) mbstate_t state; size_t result; - memset (&state, 0, sizeof (mbstate_t)); + mbszero (&state); result = wcrtomb (s, wc, &state); if (result == (size_t)-1) return -1; diff --git a/modules/wctomb b/modules/wctomb index dec45af5c7..10846fa40f 100644 --- a/modules/wctomb +++ b/modules/wctomb @@ -8,8 +8,9 @@ m4/wctomb.m4 Depends-on: stdlib -wcrtomb [test $REPLACE_WCTOMB = 1] wchar [test $REPLACE_WCTOMB = 1] +mbszero [test $REPLACE_WCTOMB = 1] +wcrtomb [test $REPLACE_WCTOMB = 1] configure.ac: gl_FUNC_WCTOMB -- 2.34.1
>From 5e121a71e762da84ce4191b944c4f7dc3d4a8c60 Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:33 +0200 Subject: [PATCH 07/19] c32tob: Optimize clearing an mbstate_t. * lib/c32tob.c (c32tob): Use mbszero. * modules/c32tob (Depends-on): Add mbszero. --- ChangeLog | 6 ++++++ lib/c32tob.c | 2 +- modules/c32tob | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index 0bfe766a63..d322288f8f 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,9 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + c32tob: Optimize clearing an mbstate_t. + * lib/c32tob.c (c32tob): Use mbszero. + * modules/c32tob (Depends-on): Add mbszero. + 2023-07-16 Bruno Haible <br...@clisp.org> wctomb: Optimize clearing an mbstate_t. diff --git a/lib/c32tob.c b/lib/c32tob.c index f0e0c35ef9..85df66a32e 100644 --- a/lib/c32tob.c +++ b/lib/c32tob.c @@ -44,7 +44,7 @@ c32tob (wint_t wc) mbstate_t state; char buf[8]; - memset (&state, '\0', sizeof (mbstate_t)); + mbszero (&state); if (c32rtomb (buf, wc, &state) == 1) return (unsigned char) buf[0]; } diff --git a/modules/c32tob b/modules/c32tob index 51b1d80359..11717a78f1 100644 --- a/modules/c32tob +++ b/modules/c32tob @@ -10,6 +10,7 @@ m4/codeset.m4 Depends-on: uchar +mbszero c32rtomb wctob -- 2.34.1
>From 897f2f1d9820b85d07e57a8ed18afc48ef160abb Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:35 +0200 Subject: [PATCH 08/19] mbstowcs: Optimize clearing an mbstate_t. * lib/mbstowcs.c (mbstowcs): Use mbszero. * modules/mbstowcs (Depends-on): Add mbszero. --- ChangeLog | 6 ++++++ lib/mbstowcs.c | 2 +- modules/mbstowcs | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index d322288f8f..5b01e49aef 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,9 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + mbstowcs: Optimize clearing an mbstate_t. + * lib/mbstowcs.c (mbstowcs): Use mbszero. + * modules/mbstowcs (Depends-on): Add mbszero. + 2023-07-16 Bruno Haible <br...@clisp.org> c32tob: Optimize clearing an mbstate_t. diff --git a/lib/mbstowcs.c b/lib/mbstowcs.c index e32d9acf88..c25467c49d 100644 --- a/lib/mbstowcs.c +++ b/lib/mbstowcs.c @@ -28,6 +28,6 @@ mbstowcs (wchar_t *dest, const char *src, size_t len) { mbstate_t state; - memset (&state, '\0', sizeof (mbstate_t)); + mbszero (&state); return mbsrtowcs (dest, &src, len, &state); } diff --git a/modules/mbstowcs b/modules/mbstowcs index 44ba43d977..e41e0a5938 100644 --- a/modules/mbstowcs +++ b/modules/mbstowcs @@ -8,6 +8,7 @@ m4/mbrtowc.m4 Depends-on: stdlib +mbszero [test $REPLACE_MBSTOWCS = 1] mbsrtowcs [test $REPLACE_MBSTOWCS = 1] configure.ac: -- 2.34.1
>From b1a67b984f16fc64b802735e654ec8aad977c7af Mon Sep 17 00:00:00 2001 From: Bruno Haible <br...@clisp.org> Date: Sun, 16 Jul 2023 07:30:37 +0200 Subject: [PATCH 09/19] mbstoc32s: Optimize clearing an mbstate_t. * lib/mbstoc32s.c (mbstoc32s): Use mbszero. * lib/uchar.in.h (mbstoc32s): Likewise. * modules/mbstoc32s (Depends-on): Add mbszero. --- ChangeLog | 7 +++++++ lib/mbstoc32s.c | 2 +- lib/uchar.in.h | 2 +- modules/mbstoc32s | 1 + 4 files changed, 10 insertions(+), 2 deletions(-) diff --git a/ChangeLog b/ChangeLog index 5b01e49aef..b5d26043aa 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,10 @@ +2023-07-16 Bruno Haible <br...@clisp.org> + + mbstoc32s: Optimize clearing an mbstate_t. + * lib/mbstoc32s.c (mbstoc32s): Use mbszero. + * lib/uchar.in.h (mbstoc32s): Likewise. + * modules/mbstoc32s (Depends-on): Add mbszero. + 2023-07-16 Bruno Haible <br...@clisp.org> mbstowcs: Optimize clearing an mbstate_t. diff --git a/lib/mbstoc32s.c b/lib/mbstoc32s.c index 0dac80e8b0..0cd40d75d0 100644 --- a/lib/mbstoc32s.c +++ b/lib/mbstoc32s.c @@ -32,6 +32,6 @@ mbstoc32s (char32_t *dest, const char *src, size_t len) { mbstate_t state; - memset (&state, '\0', sizeof (mbstate_t)); + mbszero (&state); return mbsrtoc32s (dest, &src, len, &state); } diff --git a/lib/uchar.in.h b/lib/uchar.in.h index 1c2bd008f5..b3fa531d08 100644 --- a/lib/uchar.in.h +++ b/lib/uchar.in.h @@ -654,7 +654,7 @@ mbstoc32s (char32_t *dest, const char *src, size_t len) { mbstate_t state; - memset (&state, '\0', sizeof (mbstate_t)); + mbszero (&state); return mbsrtoc32s (dest, &src, len, &state); } _GL_END_C_LINKAGE diff --git a/modules/mbstoc32s b/modules/mbstoc32s index 1c554202c9..f926ef1561 100644 --- a/modules/mbstoc32s +++ b/modules/mbstoc32s @@ -7,6 +7,7 @@ lib/mbstoc32s.c Depends-on: uchar wchar +mbszero mbsrtoc32s configure.ac: -- 2.34.1
From e568a5a3ba4f2de4a123c0734cc4226597a91713 Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:41 +0200
Subject: [PATCH 10/19] c32stombs: Optimize clearing an mbstate_t.

* lib/c32stombs.c (c32stombs): Use mbszero.
* lib/uchar.in.h (c32stombs): Likewise.
* modules/c32stombs (Depends-on): Add mbszero.
---
 ChangeLog         | 7 +++++++
 lib/c32stombs.c   | 2 +-
 lib/uchar.in.h    | 2 +-
 modules/c32stombs | 1 +
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b5d26043aa..d89119176f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	c32stombs: Optimize clearing an mbstate_t.
+	* lib/c32stombs.c (c32stombs): Use mbszero.
+	* lib/uchar.in.h (c32stombs): Likewise.
+	* modules/c32stombs (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	mbstoc32s: Optimize clearing an mbstate_t.
diff --git a/lib/c32stombs.c b/lib/c32stombs.c
index 86e84b2fe7..17a4e1ab61 100644
--- a/lib/c32stombs.c
+++ b/lib/c32stombs.c
@@ -32,6 +32,6 @@ c32stombs (char *dest, const char32_t *src, size_t len)
 {
   mbstate_t state;

-  memset (&state, '\0', sizeof (mbstate_t));
+  mbszero (&state);
   return c32srtombs (dest, &src, len, &state);
 }
diff --git a/lib/uchar.in.h b/lib/uchar.in.h
index b3fa531d08..15c4818aed 100644
--- a/lib/uchar.in.h
+++ b/lib/uchar.in.h
@@ -482,7 +482,7 @@ c32stombs (char *dest, const char32_t *src, size_t len)
 {
   mbstate_t state;

-  memset (&state, '\0', sizeof (mbstate_t));
+  mbszero (&state);
   return c32srtombs (dest, &src, len, &state);
 }
 _GL_END_C_LINKAGE
diff --git a/modules/c32stombs b/modules/c32stombs
index 1067c433af..103c8758c0 100644
--- a/modules/c32stombs
+++ b/modules/c32stombs
@@ -7,6 +7,7 @@ lib/c32stombs.c
 Depends-on:
 uchar
 wchar
+mbszero
 c32srtombs

 configure.ac:
-- 
2.34.1
From 0c3084c97be5bba6456bc8f07c6d4cbb0451c4cd Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:43 +0200
Subject: [PATCH 11/19] mbiter: Optimize clearing an mbstate_t.

* lib/mbiter.h: Include <wchar.h>.
(mbiter_multi_next, mbiter_multi_copy, mbi_init): Use mbszero.
* modules/mbiter (Depends-on): Add mbszero.
---
 ChangeLog      | 7 +++++++
 lib/mbiter.h   | 9 +++++----
 modules/mbiter | 1 +
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index d89119176f..c14b35b2d3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	mbiter: Optimize clearing an mbstate_t.
+	* lib/mbiter.h: Include <wchar.h>.
+	(mbiter_multi_next, mbiter_multi_copy, mbi_init): Use mbszero.
+	* modules/mbiter (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	c32stombs: Optimize clearing an mbstate_t.
diff --git a/lib/mbiter.h b/lib/mbiter.h
index e5d7241036..c9c18df05d 100644
--- a/lib/mbiter.h
+++ b/lib/mbiter.h
@@ -91,6 +91,7 @@
 #include <stddef.h>
 #include <string.h>
 #include <uchar.h>
+#include <wchar.h>

 #include "mbchar.h"

@@ -161,7 +162,7 @@ mbiter_multi_next (struct mbiter_multi *iter)
 #if !GNULIB_MBRTOC32_REGULAR
           iter->in_shift = false;
 #endif
-          memset (&iter->state, '\0', sizeof (mbstate_t));
+          mbszero (&iter->state);
         }
       else if (iter->cur.bytes == (size_t) -2)
         {
@@ -219,7 +220,7 @@ mbiter_multi_copy (struct mbiter_multi *new_iter, const struct mbiter_multi *old
     memcpy (&new_iter->state, &old_iter->state, sizeof (mbstate_t));
   else
 #endif
-    memset (&new_iter->state, 0, sizeof (mbstate_t));
+    mbszero (&new_iter->state);
   new_iter->next_done = old_iter->next_done;
   mb_copy (&new_iter->cur, &old_iter->cur);
 }
@@ -229,13 +230,13 @@ typedef struct mbiter_multi mbi_iterator_t;
 #if !GNULIB_MBRTOC32_REGULAR
 #define mbi_init(iter, startptr, length) \
   ((iter).cur.ptr = (startptr), (iter).limit = (iter).cur.ptr + (length), \
-   (iter).in_shift = false, memset (&(iter).state, '\0', sizeof (mbstate_t)), \
+   (iter).in_shift = false, mbszero (&(iter).state), \
    (iter).next_done = false)
 #else
 /* Optimized: no in_shift.  */
 #define mbi_init(iter, startptr, length) \
   ((iter).cur.ptr = (startptr), (iter).limit = (iter).cur.ptr + (length), \
-   memset (&(iter).state, '\0', sizeof (mbstate_t)), \
+   mbszero (&(iter).state), \
    (iter).next_done = false)
 #endif
 #if !GNULIB_MBRTOC32_REGULAR
diff --git a/modules/mbiter b/modules/mbiter
index 1d50d1148e..29d217f194 100644
--- a/modules/mbiter
+++ b/modules/mbiter
@@ -12,6 +12,7 @@ extern-inline
 mbchar
 mbrtoc32
 mbsinit
+mbszero
 uchar
 stdbool

-- 
2.34.1
From 9cfa58a7d31c93196ce36f134224526666de77a7 Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:46 +0200
Subject: [PATCH 12/19] mbuiter: Optimize clearing an mbstate_t.

* lib/mbuiter.h: Include <wchar.h>.
(mbuiter_multi_next, mbuiter_multi_copy, mbui_init): Use mbszero.
* modules/mbuiter (Depends-on): Add mbszero.
---
 ChangeLog       | 7 +++++++
 lib/mbuiter.h   | 9 +++++----
 modules/mbuiter | 1 +
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index c14b35b2d3..bdebe0b769 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	mbuiter: Optimize clearing an mbstate_t.
+	* lib/mbuiter.h: Include <wchar.h>.
+	(mbuiter_multi_next, mbuiter_multi_copy, mbui_init): Use mbszero.
+	* modules/mbuiter (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	mbiter: Optimize clearing an mbstate_t.
diff --git a/lib/mbuiter.h b/lib/mbuiter.h
index e78ffa46ae..6a06032941 100644
--- a/lib/mbuiter.h
+++ b/lib/mbuiter.h
@@ -99,6 +99,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <uchar.h>
+#include <wchar.h>

 #include "mbchar.h"
 #include "strnlen1.h"
@@ -170,7 +171,7 @@ mbuiter_multi_next (struct mbuiter_multi *iter)
 #if !GNULIB_MBRTOC32_REGULAR
           iter->in_shift = false;
 #endif
-          memset (&iter->state, '\0', sizeof (mbstate_t));
+          mbszero (&iter->state);
         }
       else if (iter->cur.bytes == (size_t) -2)
         {
@@ -222,7 +223,7 @@ mbuiter_multi_copy (struct mbuiter_multi *new_iter, const struct mbuiter_multi *
     memcpy (&new_iter->state, &old_iter->state, sizeof (mbstate_t));
   else
 #endif
-    memset (&new_iter->state, 0, sizeof (mbstate_t));
+    mbszero (&new_iter->state);
   new_iter->next_done = old_iter->next_done;
   mb_copy (&new_iter->cur, &old_iter->cur);
 }
@@ -232,13 +233,13 @@ typedef struct mbuiter_multi mbui_iterator_t;
 #if !GNULIB_MBRTOC32_REGULAR
 #define mbui_init(iter, startptr) \
   ((iter).cur.ptr = (startptr), \
-   (iter).in_shift = false, memset (&(iter).state, '\0', sizeof (mbstate_t)), \
+   (iter).in_shift = false, mbszero (&(iter).state), \
    (iter).next_done = false)
 #else
 /* Optimized: no in_shift.  */
 #define mbui_init(iter, startptr) \
   ((iter).cur.ptr = (startptr), \
-   memset (&(iter).state, '\0', sizeof (mbstate_t)), \
+   mbszero (&(iter).state), \
    (iter).next_done = false)
 #endif
 #define mbui_avail(iter) \
diff --git a/modules/mbuiter b/modules/mbuiter
index 23b1b20d04..53a1d13dc8 100644
--- a/modules/mbuiter
+++ b/modules/mbuiter
@@ -12,6 +12,7 @@ extern-inline
 mbchar
 mbrtoc32
 mbsinit
+mbszero
 uchar
 stdbool
 strnlen1
-- 
2.34.1
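An aside for readers who do not have the iterator sources at hand: the two iterator patches above touch the spot where the state is reset after an invalid multibyte sequence. The following is only a rough sketch of that pattern in plain ISO C, not the gnulib code. The names sketch_mbszero and sketch_count_chars are hypothetical, mbrtowc stands in for mbrtoc32, and clearing the whole object with memset is the conservative assumption (the real mbszero may clear fewer bytes on some platforms).

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for gnulib's mbszero: it clears the whole
   object, which is always sufficient to reach the initial state.  */
static void
sketch_mbszero (mbstate_t *ps)
{
  memset (ps, 0, sizeof *ps);
}

/* Count the characters in the first N bytes of S.  An invalid byte
   counts as one character, similar to what the iterators do.  */
static size_t
sketch_count_chars (const char *s, size_t n)
{
  mbstate_t state;
  size_t count = 0;

  sketch_mbszero (&state);
  while (n > 0)
    {
      wchar_t wc;
      size_t bytes = mbrtowc (&wc, s, n, &state);

      if (bytes == (size_t) -1)
        {
          /* Invalid sequence: consume one byte and restart from the
             initial state, because the state is now unspecified.  */
          bytes = 1;
          sketch_mbszero (&state);
        }
      else if (bytes == (size_t) -2)
        {
          /* Incomplete character at the end: treat the rest as one.  */
          bytes = n;
          sketch_mbszero (&state);
        }
      else if (bytes == 0)
        /* A null wide character was found; it occupies one byte.  */
        bytes = 1;

      s += bytes;
      n -= bytes;
      count++;
    }
  return count;
}
```

The reset after (size_t) -1 is the step the patches change: only the way the state gets cleared differs, the control flow stays the same.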
From 948151dcf1e1a55076831493eddd67e0d5b8e8b8 Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:48 +0200
Subject: [PATCH 13/19] mbfile: Optimize clearing an mbstate_t.

* lib/mbfile.h (mbfile_multi_getc, mbf_init): Use mbszero.
* modules/mbfile (Depends-on): Add mbszero.
---
 ChangeLog      | 6 ++++++
 lib/mbfile.h   | 4 ++--
 modules/mbfile | 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bdebe0b769..946725e139 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	mbfile: Optimize clearing an mbstate_t.
+	* lib/mbfile.h (mbfile_multi_getc, mbf_init): Use mbszero.
+	* modules/mbfile (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	mbuiter: Optimize clearing an mbstate_t.
diff --git a/lib/mbfile.h b/lib/mbfile.h
index 74d4986577..dccdedb03c 100644
--- a/lib/mbfile.h
+++ b/lib/mbfile.h
@@ -150,7 +150,7 @@ mbfile_multi_getc (struct mbchar *mbc, struct mbfile_multi *mbf)
           bytes = 1;
           mbc->wc_valid = false;
           /* Allow the next invocation to continue from a sane state.  */
-          memset (&mbf->state, '\0', sizeof (mbstate_t));
+          mbszero (&mbf->state);
           break;
         }
       else if (bytes == (size_t) -2)
@@ -244,7 +244,7 @@ typedef mbchar_t mbf_char_t;
   ((mbf).fp = (stream), \
    (mbf).eof_seen = false, \
    (mbf).have_pushback = false, \
-   memset (&(mbf).state, '\0', sizeof (mbstate_t)), \
+   mbszero (&(mbf).state), \
    (mbf).bufcount = 0)

 #define mbf_getc(mbc, mbf) mbfile_multi_getc (&(mbc), &(mbf))
diff --git a/modules/mbfile b/modules/mbfile
index c446e37e17..af3674ea79 100644
--- a/modules/mbfile
+++ b/modules/mbfile
@@ -12,6 +12,7 @@ extern-inline
 mbchar
 mbrtowc
 mbsinit
+mbszero
 wchar
 stdbool

-- 
2.34.1
From a7adefb5337ac617dae143966f563ec325352038 Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:50 +0200
Subject: [PATCH 14/19] mbswidth: Optimize clearing an mbstate_t.

* lib/mbswidth.c (mbsnwidth): Use mbszero.
* modules/mbswidth (Depends-on): Add mbszero.
---
 ChangeLog        | 6 ++++++
 lib/mbswidth.c   | 2 +-
 modules/mbswidth | 1 +
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 946725e139..49d27cc084 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	mbswidth: Optimize clearing an mbstate_t.
+	* lib/mbswidth.c (mbsnwidth): Use mbszero.
+	* modules/mbswidth (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	mbfile: Optimize clearing an mbstate_t.
diff --git a/lib/mbswidth.c b/lib/mbswidth.c
index a1613dcad6..9ce94ae80d 100644
--- a/lib/mbswidth.c
+++ b/lib/mbswidth.c
@@ -94,7 +94,7 @@ mbsnwidth (const char *string, size_t nbytes, int flags)
                 /* If we have a multibyte sequence, scan it up to its end.  */
                 {
                   mbstate_t mbstate;
-                  memset (&mbstate, 0, sizeof mbstate);
+                  mbszero (&mbstate);
                   for (;;)
                     {
                       char32_t wc;
diff --git a/modules/mbswidth b/modules/mbswidth
index 4dd8c55ee4..3c2e17dc06 100644
--- a/modules/mbswidth
+++ b/modules/mbswidth
@@ -13,6 +13,7 @@ wchar
 uchar
 mbrtoc32
 mbsinit
+mbszero
 c32width
 c32iscntrl
 extensions
-- 
2.34.1
From e3d567d5822f59cffe68adef5321f5dbaca09c53 Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:52 +0200
Subject: [PATCH 15/19] mbmemcasecoll: Optimize clearing an mbstate_t.

* lib/mbmemcasecoll.c (apply_c32tolower): Use mbszero.
* modules/mbmemcasecoll (Depends-on): Add mbszero.
---
 ChangeLog             | 6 ++++++
 lib/mbmemcasecoll.c   | 4 ++--
 modules/mbmemcasecoll | 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 49d27cc084..80f6503b16 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	mbmemcasecoll: Optimize clearing an mbstate_t.
+	* lib/mbmemcasecoll.c (apply_c32tolower): Use mbszero.
+	* modules/mbmemcasecoll (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	mbswidth: Optimize clearing an mbstate_t.
diff --git a/lib/mbmemcasecoll.c b/lib/mbmemcasecoll.c
index 0b765ff05f..08ea1df6bb 100644
--- a/lib/mbmemcasecoll.c
+++ b/lib/mbmemcasecoll.c
@@ -53,7 +53,7 @@ apply_c32tolower (const char *inbuf, size_t inbufsize,
   {
     mbstate_t state;

-    memset (&state, '\0', sizeof (mbstate_t));
+    mbszero (&state);
     for (;;)
       {
         char32_t wc1;
@@ -97,7 +97,7 @@ apply_c32tolower (const char *inbuf, size_t inbufsize,
             mbstate_t state2;
             size_t n2;

-            memset (&state2, '\0', sizeof (mbstate_t));
+            mbszero (&state2);
             n2 = c32rtomb (outbuf, wc2, &state2);
             if (n2 != (size_t)(-1))
               {
diff --git a/modules/mbmemcasecoll b/modules/mbmemcasecoll
index ee52717f3e..7231db71ac 100644
--- a/modules/mbmemcasecoll
+++ b/modules/mbmemcasecoll
@@ -11,6 +11,7 @@ stdbool
 wchar
 uchar
 malloca
+mbszero
 mbrtoc32
 c32rtomb
 c32tolower
-- 
2.34.1
From dad9ed27ab75a4e6aa39205d19a6978ba15fc3ab Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:55 +0200
Subject: [PATCH 16/19] vasnprintf, vasnwprintf: Optimize clearing an mbstate_t.

* lib/vasnprintf.c (VASNPRINTF): Use mbszero.
* modules/vasnprintf (Depends-on): Add mbszero.
* modules/vasnwprintf (Depends-on): Likewise.
* modules/c-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u8-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u8-u8-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u16-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u16-u16-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u32-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u32-u32-vasnprintf (Depends-on): Likewise.
* modules/unistdio/ulc-vasnprintf (Depends-on): Likewise.
---
 ChangeLog                           | 15 +++++++++++++
 lib/vasnprintf.c                    | 34 ++++++++++++++---------------
 modules/c-vasnprintf                |  1 +
 modules/unistdio/u16-u16-vasnprintf |  1 +
 modules/unistdio/u16-vasnprintf     |  1 +
 modules/unistdio/u32-u32-vasnprintf |  1 +
 modules/unistdio/u32-vasnprintf     |  1 +
 modules/unistdio/u8-u8-vasnprintf   |  1 +
 modules/unistdio/u8-vasnprintf      |  1 +
 modules/unistdio/ulc-vasnprintf     |  1 +
 modules/vasnprintf                  |  1 +
 modules/vasnwprintf                 |  1 +
 12 files changed, 42 insertions(+), 17 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 80f6503b16..bf52c82001 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,18 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	vasnprintf, vasnwprintf: Optimize clearing an mbstate_t.
+	* lib/vasnprintf.c (VASNPRINTF): Use mbszero.
+	* modules/vasnprintf (Depends-on): Add mbszero.
+	* modules/vasnwprintf (Depends-on): Likewise.
+	* modules/c-vasnprintf (Depends-on): Likewise.
+	* modules/unistdio/u8-vasnprintf (Depends-on): Likewise.
+	* modules/unistdio/u8-u8-vasnprintf (Depends-on): Likewise.
+	* modules/unistdio/u16-vasnprintf (Depends-on): Likewise.
+	* modules/unistdio/u16-u16-vasnprintf (Depends-on): Likewise.
+	* modules/unistdio/u32-vasnprintf (Depends-on): Likewise.
+	* modules/unistdio/u32-u32-vasnprintf (Depends-on): Likewise.
+	* modules/unistdio/ulc-vasnprintf (Depends-on): Likewise.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	mbmemcasecoll: Optimize clearing an mbstate_t.
diff --git a/lib/vasnprintf.c b/lib/vasnprintf.c
index 9ad31b2a08..2d9aa977ec 100644
--- a/lib/vasnprintf.c
+++ b/lib/vasnprintf.c
@@ -83,7 +83,7 @@
 #include <stdio.h>      /* snprintf(), sprintf() */
 #include <stdlib.h>     /* abort(), malloc(), realloc(), free() */
 #include <string.h>     /* memcpy(), strlen() */
-#include <wchar.h>      /* mbstate_t, mbrtowc(), mbrlen(), wcrtomb() */
+#include <wchar.h>      /* mbstate_t, mbrtowc(), mbrlen(), wcrtomb(), mbszero() */
 #include <errno.h>      /* errno */
 #include <limits.h>     /* CHAR_BIT, INT_WIDTH, LONG_WIDTH */
 #include <float.h>      /* DBL_MAX_EXP, LDBL_MAX_EXP */
@@ -3007,7 +3007,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                            wide characters, from the left.  */
 #  if HAVE_MBRTOWC
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         arg_end = arg;
                         characters = 0;
@@ -3035,7 +3035,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                            characters.  */
 #  if HAVE_MBRTOWC
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         arg_end = arg;
                         characters = 0;
@@ -3079,7 +3079,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         size_t remaining;
 #  if HAVE_MBRTOWC
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         ENSURE_ALLOCATION (xsum (length, characters));
                         for (remaining = characters; remaining > 0; remaining--)
@@ -3105,7 +3105,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                       {
 #  if HAVE_MBRTOWC
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         while (arg < arg_end)
                           {
@@ -3157,7 +3157,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                            at most PRECISION bytes, from the left.  */
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         arg_end = arg;
                         characters = 0;
@@ -3190,7 +3190,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                            bytes.  */
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         arg_end = arg;
                         characters = 0;
@@ -3230,7 +3230,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         size_t remaining;
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         for (remaining = characters; remaining > 0; )
                           {
@@ -3299,7 +3299,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         size_t remaining;
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         ENSURE_ALLOCATION (xsum (length, characters));
                         for (remaining = characters; remaining > 0; )
@@ -3325,7 +3325,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                       {
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif
                         while (arg < arg_end)
                           {
@@ -3430,7 +3430,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         int count;
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif

                         count = local_wcrtomb (cbuf, arg, &state);
@@ -3456,7 +3456,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         int count;
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif

                         count = local_wcrtomb (cbuf, arg, &state);
@@ -3512,7 +3512,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         int count;
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif

                         count = local_wcrtomb (result + length, arg, &state);
@@ -3530,7 +3530,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         int count;
 #  if HAVE_WCRTOMB && !defined GNULIB_defined_mbstate_t
                         mbstate_t state;
-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
 #  endif

                         count = local_wcrtomb (cbuf, arg, &state);
@@ -3604,7 +3604,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                     mbstate_t state;
                     wchar_t wc;

-                    memset (&state, '\0', sizeof (mbstate_t));
+                    mbszero (&state);
                     int count = mbrtowc (&wc, &arg, 1, &state);
                     if (count < 0)
                       /* Invalid or incomplete multibyte character.  */
@@ -6602,7 +6602,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                            wide character array.  */
                         mbstate_t state;

-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
                         tmpdst_len = 0;
                         {
                           const TCHAR_T *src = tmpsrc;
@@ -6626,7 +6626,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         if (tmpdst == NULL)
                           goto out_of_memory;

-                        memset (&state, '\0', sizeof (mbstate_t));
+                        mbszero (&state);
                         {
                           DCHAR_T *destptr = tmpdst;
                           const TCHAR_T *src = tmpsrc;
diff --git a/modules/c-vasnprintf b/modules/c-vasnprintf
index 7ccd83c82e..614e8346a7 100644
--- a/modules/c-vasnprintf
+++ b/modules/c-vasnprintf
@@ -42,6 +42,7 @@ xsize
 errno
 memchr
 multiarch
+mbszero

 configure.ac:
 AC_REQUIRE([AC_C_RESTRICT])
diff --git a/modules/unistdio/u16-u16-vasnprintf b/modules/unistdio/u16-u16-vasnprintf
index a6fd0ac0fa..09a1c55545 100644
--- a/modules/unistdio/u16-u16-vasnprintf
+++ b/modules/unistdio/u16-u16-vasnprintf
@@ -48,6 +48,7 @@ free-posix
 memchr
 multiarch
 assert-h
+mbszero

 configure.ac:
 gl_PREREQ_VASNPRINTF_WITH_POSIX_EXTRAS
diff --git a/modules/unistdio/u16-vasnprintf b/modules/unistdio/u16-vasnprintf
index 58117892ef..bd5b525f79 100644
--- a/modules/unistdio/u16-vasnprintf
+++ b/modules/unistdio/u16-vasnprintf
@@ -48,6 +48,7 @@ free-posix
 memchr
 multiarch
 assert-h
+mbszero

 configure.ac:
 gl_PREREQ_VASNPRINTF_WITH_POSIX_EXTRAS
diff --git a/modules/unistdio/u32-u32-vasnprintf b/modules/unistdio/u32-u32-vasnprintf
index bda4dab407..c2a975da87 100644
--- a/modules/unistdio/u32-u32-vasnprintf
+++ b/modules/unistdio/u32-u32-vasnprintf
@@ -48,6 +48,7 @@ free-posix
 memchr
 multiarch
 assert-h
+mbszero

 configure.ac:
 gl_PREREQ_VASNPRINTF_WITH_POSIX_EXTRAS
diff --git a/modules/unistdio/u32-vasnprintf b/modules/unistdio/u32-vasnprintf
index 50168e371c..bff21f35ce 100644
--- a/modules/unistdio/u32-vasnprintf
+++ b/modules/unistdio/u32-vasnprintf
@@ -48,6 +48,7 @@ free-posix
 memchr
 multiarch
 assert-h
+mbszero

 configure.ac:
 gl_PREREQ_VASNPRINTF_WITH_POSIX_EXTRAS
diff --git a/modules/unistdio/u8-u8-vasnprintf b/modules/unistdio/u8-u8-vasnprintf
index 18f6e07a36..07124cc6cb 100644
--- a/modules/unistdio/u8-u8-vasnprintf
+++ b/modules/unistdio/u8-u8-vasnprintf
@@ -48,6 +48,7 @@ free-posix
 memchr
 multiarch
 assert-h
+mbszero

 configure.ac:
 gl_PREREQ_VASNPRINTF_WITH_POSIX_EXTRAS
diff --git a/modules/unistdio/u8-vasnprintf b/modules/unistdio/u8-vasnprintf
index e71fb2f16c..69e52ecf9f 100644
--- a/modules/unistdio/u8-vasnprintf
+++ b/modules/unistdio/u8-vasnprintf
@@ -48,6 +48,7 @@ free-posix
 memchr
 multiarch
 assert-h
+mbszero

 configure.ac:
 gl_PREREQ_VASNPRINTF_WITH_POSIX_EXTRAS
diff --git a/modules/unistdio/ulc-vasnprintf b/modules/unistdio/ulc-vasnprintf
index 8b43a8488a..e535d18985 100644
--- a/modules/unistdio/ulc-vasnprintf
+++ b/modules/unistdio/ulc-vasnprintf
@@ -46,6 +46,7 @@ free-posix
 memchr
 multiarch
 assert-h
+mbszero

 configure.ac:
 gl_PREREQ_VASNPRINTF_WITH_POSIX_EXTRAS
diff --git a/modules/vasnprintf b/modules/vasnprintf
index c87fe7a106..b436430520 100644
--- a/modules/vasnprintf
+++ b/modules/vasnprintf
@@ -33,6 +33,7 @@ errno
 memchr
 assert-h
 wchar
+mbszero

 configure.ac:
 AC_REQUIRE([AC_C_RESTRICT])
diff --git a/modules/vasnwprintf b/modules/vasnwprintf
index 70f3fb2c95..67143e7455 100644
--- a/modules/vasnwprintf
+++ b/modules/vasnwprintf
@@ -37,6 +37,7 @@ errno
 memchr
 assert-h
 wchar
+mbszero
 mbrtowc
 wmemcpy
 wmemset
-- 
2.34.1
From a85a9b34bd5ccd4cfe2f0c3c445199a91517814a Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:57 +0200
Subject: [PATCH 17/19] quotearg: Optimize clearing an mbstate_t.

* lib/quotearg.c: Include <wchar.h>.
(quotearg_buffer_restyled): Use mbszero.
* modules/quotearg (Depends-on): Add mbszero.
---
 ChangeLog        | 7 +++++++
 lib/quotearg.c   | 3 ++-
 modules/quotearg | 1 +
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index bf52c82001..7371228547 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	quotearg: Optimize clearing an mbstate_t.
+	* lib/quotearg.c: Include <wchar.h>.
+	(quotearg_buffer_restyled): Use mbszero.
+	* modules/quotearg (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	vasnprintf, vasnwprintf: Optimize clearing an mbstate_t.
diff --git a/lib/quotearg.c b/lib/quotearg.c
index 5b26055b2e..8e500d9b67 100644
--- a/lib/quotearg.c
+++ b/lib/quotearg.c
@@ -42,6 +42,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <uchar.h>
+#include <wchar.h>

 #include "gettext.h"
 #define _(msgid) gettext (msgid)
@@ -607,7 +608,7 @@ quotearg_buffer_restyled (char *buffer, size_t buffersize,
             else
               {
                 mbstate_t mbstate;
-                memset (&mbstate, 0, sizeof mbstate);
+                mbszero (&mbstate);

                 m = 0;
                 printable = true;
diff --git a/modules/quotearg b/modules/quotearg
index 6e823fa9c6..89f81fd55f 100644
--- a/modules/quotearg
+++ b/modules/quotearg
@@ -13,6 +13,7 @@ c-strcaseeq
 c32isprint
 extensions
 gettext-h
+mbszero
 mbrtoc32
 mbsinit
 memcmp
-- 
2.34.1
From 13cd6eb1e9f6d0a540727616ba6843648e74dc15 Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:30:59 +0200
Subject: [PATCH 18/19] uchar-c23: Optimize clearing an mbstate_t.

* lib/lc-charset-unicode.c (locale_encoding_to_unicode,
unicode_to_locale_encoding): Use mbszero.
* modules/uchar-c23 (Depends-on): Add mbszero.
---
 ChangeLog                | 7 +++++++
 lib/lc-charset-unicode.c | 6 ++++--
 modules/uchar-c23        | 1 +
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 7371228547..567c971b6f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	uchar-c23: Optimize clearing an mbstate_t.
+	* lib/lc-charset-unicode.c (locale_encoding_to_unicode,
+	unicode_to_locale_encoding): Use mbszero.
+	* modules/uchar-c23 (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	quotearg: Optimize clearing an mbstate_t.
diff --git a/lib/lc-charset-unicode.c b/lib/lc-charset-unicode.c
index afbc188ad7..20994618b4 100644
--- a/lib/lc-charset-unicode.c
+++ b/lib/lc-charset-unicode.c
@@ -158,7 +158,8 @@ locale_encoding_to_unicode (wchar_t wc)
   char mbbuf[64];
   size_t mbcnt;
   {
-    mbstate_t state = { 0 };
+    mbstate_t state;
+    mbszero (&state);
     mbcnt = wcrtomb (mbbuf, wc, &state);
     if (mbcnt > sizeof (mbbuf))
       /* wcrtomb did not recognize the wide character wc.  */
@@ -248,7 +249,8 @@ unicode_to_locale_encoding (char32_t uc)

   wchar_t wc;
   {
-    mbstate_t state = { 0 };
+    mbstate_t state;
+    mbszero (&state);
     if (mbrtowc (&wc, mbbuf, mbcnt, &state) != mbcnt)
       /* iconv produced an invalid multibyte sequence.  */
       return 0;
diff --git a/modules/uchar-c23 b/modules/uchar-c23
index 11e08652cb..5fcd2802bf 100644
--- a/modules/uchar-c23
+++ b/modules/uchar-c23
@@ -13,6 +13,7 @@ localcharset
 streq
 lock
 tls
+mbszero
 wcrtomb
 unistr/u8-mbtouc
 unistr/u8-uctomb
-- 
2.34.1
From 776c52c2f432891e2a40d3be889683f1cc2d3452 Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Sun, 16 Jul 2023 07:31:02 +0200
Subject: [PATCH 19/19] dfa: Optimize clearing an mbstate_t.

* lib/dfa.c (mbszero) [GAWK]: Add fallback definition.
(mbs_to_wchar, lex, addtok_wc, dfaexec_main): Use mbszero.
* modules/dfa (Depends-on): Add mbszero.
---
 ChangeLog   |  7 +++++++
 lib/dfa.c   | 11 +++++++----
 modules/dfa |  1 +
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 567c971b6f..2caecc594a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2023-07-16  Bruno Haible  <br...@clisp.org>
+
+	dfa: Optimize clearing an mbstate_t.
+	* lib/dfa.c (mbszero) [GAWK]: Add fallback definition.
+	(mbs_to_wchar, lex, addtok_wc, dfaexec_main): Use mbszero.
+	* modules/dfa (Depends-on): Add mbszero.
+
 2023-07-16  Bruno Haible  <br...@clisp.org>

 	uchar-c23: Optimize clearing an mbstate_t.
diff --git a/lib/dfa.c b/lib/dfa.c
index 5cf76ecf08..ff0f53348c 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -52,6 +52,7 @@
 # define c32tob wctob
 # define c32isprint iswprint
 # define c32isspace iswspace
+# define mbszero(p) memset ((p), 0, sizeof (mbstate_t))
 #else
 /* Use ISO C 11 + gnulib API.  */
 # include <uchar.h>
@@ -661,7 +662,7 @@ mbs_to_wchar (wint_t *pwc, char const *s, idx_t n, struct dfa *d)
              'mbrtoc32-regular' module.  */
           return nbytes;
         }
-      memset (&d->mbs, 0, sizeof d->mbs);
+      mbszero (&d->mbs);
     }

   *pwc = wc;
@@ -1585,7 +1586,8 @@ lex (struct dfa *dfa)
               else
                 {
                   char buf[MB_LEN_MAX + 1];
-                  mbstate_t s = { 0 };
+                  mbstate_t s;
+                  mbszero (&s);
                   size_t stored_bytes = c32rtomb (buf, dfa->lex.wctok, &s);
                   if (stored_bytes < (size_t) -1)
                     {
@@ -1722,7 +1724,8 @@ static void
 addtok_wc (struct dfa *dfa, wint_t wc)
 {
   unsigned char buf[MB_LEN_MAX];
-  mbstate_t s = { 0 };
+  mbstate_t s;
+  mbszero (&s);
   size_t stored_bytes = c32rtomb ((char *) buf, wc, &s);
   int buflen;

@@ -3495,7 +3498,7 @@ dfaexec_main (struct dfa *d, char const *begin, char *end, bool allow_nl,

   if (multibyte)
     {
-      memset (&d->mbs, 0, sizeof d->mbs);
+      mbszero (&d->mbs);
       if (d->mb_follows.alloc == 0)
         alloc_position_set (&d->mb_follows, d->nleaves);
     }
diff --git a/modules/dfa b/modules/dfa
index 57094f590b..b41eaca8dd 100644
--- a/modules/dfa
+++ b/modules/dfa
@@ -21,6 +21,7 @@ flexmember
 idx
 locale
 mbrtoc32-regular
+mbszero
 regex
 stdbool
 stddef
-- 
2.34.1
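For anyone who copies dfa.c (or similar code) into a project without gnulib's mbszero module, the fallback in the last patch can be tried out in isolation. Below is a small sketch under stated assumptions: the `#ifndef mbszero` guard and the two sketch_* helper names are made up for this example, and the fallback is the portable full-size memset, not the shortened clear discussed at the top of this mail.

```c
#include <string.h>
#include <wchar.h>

/* Fallback in the spirit of the [GAWK] branch of lib/dfa.c: if no
   mbszero macro is provided, clear the entire mbstate_t.  The guard
   is an assumption of this sketch; dfa.c keys off GAWK instead.  */
#ifndef mbszero
# define mbszero(p) memset ((p), 0, sizeof (mbstate_t))
#endif

/* Return nonzero if a freshly cleared state is the initial
   conversion state, as ISO C says a zero-valued object is.  */
static int
sketch_cleared_state_is_initial (void)
{
  mbstate_t s;
  mbszero (&s);
  return mbsinit (&s) != 0;
}

/* Convert WC to its multibyte form in BUF, which must have room for
   at least MB_CUR_MAX bytes.  The declare-then-clear pair replaces
   the former "mbstate_t s = { 0 };" initializer.  */
static size_t
sketch_wc_to_bytes (char *buf, wchar_t wc)
{
  mbstate_t s;
  mbszero (&s);
  return wcrtomb (buf, wc, &s);
}
```

Splitting the declaration from the clearing, as lex and addtok_wc now do, is what lets a platform-specific mbszero touch fewer bytes than an aggregate initializer would.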