Unicode support in poke
Hi José,

Yesterday, you identified a set of functions from GNU libunistring that would
be useful to use in GNU poke. Since you will need only a few such functions,
which sums up to little code and only one big table, you can take the
respective modules from gnulib - a regular use of gnulib-tool. All the source
code of libunistring is in gnulib, distributed across ca. 350 modules.

The list of modules is as follows. The relation between function name and
module name is obvious.

  unitypes
  unistr/base
  unistr/u8-check
  unistr/u8-chr
  unistr/u8-cmp
  unistr/u8-cmp2
  unistr/u8-cpy
  unistr/u8-cpy-alloc
  unistr/u8-endswith
  unistr/u8-mblen
  unistr/u8-mbsnlen
  unistr/u8-mbtouc
  unistr/u8-mbtoucr
  unistr/u8-mbtouc-unsafe
  unistr/u8-move
  unistr/u8-next
  unistr/u8-prev
  unistr/u8-set
  unistr/u8-startswith
  unistr/u8-stpcpy
  unistr/u8-stpncpy
  unistr/u8-strcat
  unistr/u8-strchr
  unistr/u8-strcmp
  unistr/u8-strcoll
  unistr/u8-strcpy
  unistr/u8-strcspn
  unistr/u8-strdup
  unistr/u8-strlen
  unistr/u8-strmblen
  unistr/u8-strmbtouc
  unistr/u8-strncat
  unistr/u8-strncmp
  unistr/u8-strncpy
  unistr/u8-strnlen
  unistr/u8-strpbrk
  unistr/u8-strrchr
  unistr/u8-strspn
  unistr/u8-strstr
  unistr/u8-strtok
  unistr/u8-to-u16
  unistr/u8-to-u32
  unistr/u8-uctomb
  unistr/u16-check
  unistr/u16-chr
  unistr/u16-cmp
  unistr/u16-cmp2
  unistr/u16-cpy
  unistr/u16-cpy-alloc
  unistr/u16-endswith
  unistr/u16-mblen
  unistr/u16-mbsnlen
  unistr/u16-mbtouc
  unistr/u16-mbtoucr
  unistr/u16-mbtouc-unsafe
  unistr/u16-move
  unistr/u16-next
  unistr/u16-prev
  unistr/u16-set
  unistr/u16-startswith
  unistr/u16-stpcpy
  unistr/u16-stpncpy
  unistr/u16-strcat
  unistr/u16-strchr
  unistr/u16-strcmp
  unistr/u16-strcoll
  unistr/u16-strcpy
  unistr/u16-strcspn
  unistr/u16-strdup
  unistr/u16-strlen
  unistr/u16-strmblen
  unistr/u16-strmbtouc
  unistr/u16-strncat
  unistr/u16-strncmp
  unistr/u16-strncpy
  unistr/u16-strnlen
  unistr/u16-strpbrk
  unistr/u16-strrchr
  unistr/u16-strspn
  unistr/u16-strstr
  unistr/u16-strtok
  unistr/u16-to-u32
  unistr/u16-to-u8
  unistr/u16-uctomb
  unistr/u32-check
  unistr/u32-chr
  unistr/u32-cmp
  unistr/u32-cmp2
  unistr/u32-cpy
  unistr/u32-cpy-alloc
  unistr/u32-endswith
  unistr/u32-mblen
  unistr/u32-mbsnlen
  unistr/u32-mbtouc
  unistr/u32-mbtoucr
  unistr/u32-mbtouc-unsafe
  unistr/u32-move
  unistr/u32-next
  unistr/u32-prev
  unistr/u32-set
  unistr/u32-startswith
  unistr/u32-stpcpy
  unistr/u32-stpncpy
  unistr/u32-strcat
  unistr/u32-strchr
  unistr/u32-strcmp
  unistr/u32-strcoll
  unistr/u32-strcpy
  unistr/u32-strcspn
  unistr/u32-strdup
  unistr/u32-strlen
  unistr/u32-strmblen
  unistr/u32-strmbtouc
  unistr/u32-strncat
  unistr/u32-strncmp
  unistr/u32-strncpy
  unistr/u32-strnlen
  unistr/u32-strpbrk
  unistr/u32-strrchr
  unistr/u32-strspn
  unistr/u32-strstr
  unistr/u32-strtok
  unistr/u32-to-u16
  unistr/u32-to-u8
  unistr/u32-uctomb
  uniconv/base
  uniconv/u8-conv-from-enc
  uniconv/u8-conv-to-enc
  uniconv/u8-strconv-from-enc
  uniconv/u8-strconv-from-locale
  uniconv/u8-strconv-to-enc
  uniconv/u8-strconv-to-locale
  uniconv/u16-conv-from-enc
  uniconv/u16-conv-to-enc
  uniconv/u16-strconv-from-enc
  uniconv/u16-strconv-from-locale
  uniconv/u16-strconv-to-enc
  uniconv/u16-strconv-to-locale
  uniconv/u32-conv-from-enc
  uniconv/u32-conv-to-enc
  uniconv/u32-strconv-from-enc
  uniconv/u32-strconv-from-locale
  uniconv/u32-strconv-to-enc
  uniconv/u32-strconv-to-locale
  unistdio/base
  unistdio/u8-asnprintf
  unistdio/u8-asprintf
  unistdio/u8-snprintf
  unistdio/u8-sprintf
  unistdio/u8-u8-asnprintf
  unistdio/u8-u8-asprintf
  unistdio/u8-u8-snprintf
  unistdio/u8-u8-sprintf
  unistdio/u8-u8-vasnprintf
  unistdio/u8-u8-vasprintf
  unistdio/u8-u8-vsnprintf
  unistdio/u8-u8-vsprintf
  unistdio/u8-vasnprintf
  unistdio/u8-vasprintf
  unistdio/u8-vsnprintf
  unistdio/u8-vsprintf
  unistdio/u16-asnprintf
  unistdio/u16-asprintf
  unistdio/u16-snprintf
  unistdio/u16-sprintf
  unistdio/u16-u16-asnprintf
  unistdio/u16-u16-asprintf
  unistdio/u16-u16-snprintf
  unistdio/u16-u16-sprintf
  unistdio/u16-u16-vasnprintf
  unistdio/u16-u16-vasprintf
  unistdio/u16-u16-vsnprintf
  unistdio/u16-u16-vsprintf
  unistdio/u16-vasnprintf
  unistdio/u16-vasprintf
  unistdio/u16-vsnprintf
  unistdio/u16-vsprintf
  unistdio/u32-asnprintf
  unistdio/u32-asprintf
  unistdio/u32-snprintf
  unistdio/u32-sprintf
  unistdio/u32-u32-asnprintf
  unistdio/u32-u32-asprintf
  unistdio/u32-u32-snprintf
  unistdio/u32-u32-sprintf
  unistdio/u32-u32-vasnprintf
  unistdio/u32-u32-vasprintf
  unistdio/
Re: Unicode support in poke
Hi José,

you could look at libidn2 as an example of how to use the system libunistring
if it is present (and new enough) and fall back to gnulib unistring.
(BTW, libunistring is made of the gnulib unistring modules.)

It creates a separate dir / library for gnulib unistring functions,
*BUT* only uses it when a system libunistring can't be found.

bootstrap.conf: Call gnulib-tool in bootstrap_post_import_hook() only
for the needed unistring modules.

configure.ac: Check for system libunistring (set a conditional
HAVE_LIBUNISTRING).

Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to SUBDIRS

*/Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to includes in
AM_CPPFLAGS

I think that's all.

Regards, Tim

On 1/13/20 11:33 AM, Bruno Haible wrote:
> Hi José,
>
> Yesterday, you identified a set of functions from GNU libunistring that would
> be useful to use in GNU poke. Since you will need only a few such functions,
> which sums up to little code and only one big table, you can take the
> respective modules from gnulib - a regular use of gnulib-tool. All the source
> code of libunistring is in gnulib, distributed across ca. 350 modules.
>
> The list of modules is as follows. The relation between function name and
> module name is obvious.
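The configure.ac step of the libidn2-style fallback outlined in this message could look roughly like the fragment below. It is only a sketch and not what libidn2 actually does: the shell variable have_libunistring and the choice of u8_check as the probe function are assumptions made for illustration; only the conditional name HAVE_LIBUNISTRING comes from the message. The probe also does not test whether the installed library is new enough.

```m4
# Sketch only: probe for a system libunistring at the top level of configure.ac.
# 'have_libunistring' and the u8_check probe are illustrative assumptions.
AC_MSG_CHECKING([for a usable system libunistring])
save_LIBS=$LIBS
LIBS="-lunistring $LIBS"
AC_LINK_IFELSE(
  [AC_LANG_PROGRAM([[#include <unistr.h>]],
     [[return u8_check ((const uint8_t *) "", 0) != 0;]])],
  [have_libunistring=yes],
  [have_libunistring=no])
LIBS=$save_LIBS
AC_MSG_RESULT([$have_libunistring])

# Makefile.am files can then test 'if !HAVE_LIBUNISTRING' to add unistring/.
AM_CONDITIONAL([HAVE_LIBUNISTRING], [test "$have_libunistring" = yes])
```

All macros in this fragment run unconditionally at the top level; only the result is consumed conditionally, through the Automake conditional.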
presentation at GHM
At the first GNU Hackers Meeting of 2020 I presented the recent and current work on gnulib. Here are the slides.

Attachment: haible-gnulib-2020.pdf (Adobe PDF document)
Re: Unicode support in poke
Hi Tim,

> you could look at libidn2 as an example of how to use the system libunistring
> if it is present (and new enough) and fall back to gnulib unistring.
> (BTW, libunistring is made of the gnulib unistring modules.)
>
> It creates a separate dir / library for gnulib unistring functions,
> *BUT* only uses it when a system libunistring can't be found.
>
> bootstrap.conf: Call gnulib-tool in bootstrap_post_import_hook() only
> for the needed unistring modules.
>
> configure.ac: Check for system libunistring (set a conditional
> HAVE_LIBUNISTRING).
>
> Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to SUBDIRS
>
> */Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to includes in
> AM_CPPFLAGS

A simpler way to achieve the same thing is to include the gnulib module
'libunistring-optional'. It will use the system libunistring if it
exists and is new enough, and otherwise compile the respective modules
from source.

Bruno
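For orientation, here is a rough sketch of the configure.ac side once gnulib-tool has imported 'libunistring-optional' together with the unistr/... modules a project actually needs. gl_EARLY and gl_INIT are the macros that gnulib-tool generates for an import; the package name, version, and directory names (demo, m4, lib) are placeholders, and the module selection itself happens on the gnulib-tool command line, which is not shown.

```m4
# Sketch only: 'demo', 'm4' and 'lib' are placeholder names.
AC_INIT([demo], [0.1])
AC_CONFIG_MACRO_DIR([m4])
AM_INIT_AUTOMAKE([foreign])

AC_PROG_CC
# Must come right after the compiler has been determined.
gl_EARLY

# ... the package's own checks go here ...

# Runs the configure checks of the imported gnulib modules.
gl_INIT

AC_CONFIG_FILES([Makefile lib/Makefile])
AC_OUTPUT
```

After an import, gnulib-tool prints the Makefile variables that may be needed when linking; those notes, rather than this sketch, are the place to look for the exact link flags.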
Re: Unicode support in poke
Hi Bruno,

On 1/13/20 12:01 PM, Bruno Haible wrote:
> Hi Tim,
>
>> you could look at libidn2 as an example of how to use the system libunistring
>> if it is present (and new enough) and fall back to gnulib unistring.
>> (BTW, libunistring is made of the gnulib unistring modules.)
>>
>> It creates a separate dir / library for gnulib unistring functions,
>> *BUT* only uses it when a system libunistring can't be found.
>>
>> bootstrap.conf: Call gnulib-tool in bootstrap_post_import_hook() only
>> for the needed unistring modules.
>>
>> configure.ac: Check for system libunistring (set a conditional
>> HAVE_LIBUNISTRING).
>>
>> Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to SUBDIRS
>>
>> */Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to includes in
>> AM_CPPFLAGS
>
> A simpler way to achieve the same thing is to include the gnulib module
> 'libunistring-optional'. It will use the system libunistring if it
> exists and is new enough, and otherwise compile the respective modules
> from source.

Ah, I didn't know that, thanks.

Doesn't that pull in all the unistring modules (and code)? At least for
building the project.

In libidn2 we just use a small subset of the vast functionality of
gnulib unistring. In order to keep build overhead small, isn't that the
way to go? I only know that building libunistring takes a while...

Regards, Tim
Re: Unicode support in poke
Hi Tim,

> > A simpler way to achieve the same thing is to include the gnulib module
> > 'libunistring-optional'. It will use the system libunistring if it
> > exists and is new enough, and otherwise compile the respective modules
> > from source.
>
> Ah, I didn't know that, thanks.
>
> Doesn't that pull in all the unistring modules (and code)? At least for
> building the project.

No, it will only pull in the modules that you have requested and their
dependencies.

Bruno
Re: Messed up gl_COMPILER_PREPARE_CHECK_DECL
Hi Paul,

> that incorrect line comes because ac_compile_for_check_decl is used
> before it is set. And this occurs because Emacs's configure.ac's first
> use of AC_CHECK_DECL is executed only on alpha platforms (which my
> platform is not), which means the initialization of
> ac_compile_for_check_decl is skipped.

The Emacs configure.ac is not written in a robust way. It contains 'case'
and 'if'/'test' statements which conditionally execute the Autoconf macros

  AC_MSG_CHECKING
  AC_MSG_ERROR
  AC_MSG_WARN
  AC_MSG_RESULT
  AC_CACHE_CHECK
  AC_LINK_IFELSE
  AC_LANG_PROGRAM
  AC_PATH_PROG
  AC_CHECK_DECL
  AC_DEFINE
  AC_CHECK_HEADER
  AC_CHECK_PROG
  AC_CHECK_FUNCS
  AC_COMPILE_IFELSE
  AC_CHECK_LIB
  AC_TRY_LINK
  AC_PREPROC_IFELSE
  AC_CONFIG_FILES
  AC_MSG_NOTICE

If any of these macros, in its implementation, performs an AC_REQUIRE,
you may get a problem, as described in [1].

Do we have a statement in the Autoconf documentation that none of the
built-in macros does an AC_REQUIRE? I don't think so.

Therefore I would suggest that the particular 'case' and 'if'/'test'
statements - or even the entire main body of the configure.ac, from
line 130 to line 5588 - gets wrapped in an AC_DEFUN that gets invoked once.

The other alternative is to state formally that none of these AC_* macros
do an AC_REQUIRE. But then it's hard to find a proper place for the
gl_COMPILER_PREPARE_CHECK_DECL invocation (without adding extra code to
configure scripts that don't use AC_CHECK_DECL).

What would you suggest?

Bruno

[1] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Prerequisite-Macros.html
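The wrapping suggested in this message can be sketched as follows. The macro name DEMO_MAIN_BODY and the variable demo_platform are made up for this illustration and do not appear in the Emacs configure.ac. The point is only that, inside an AC_DEFUN body, anything a macro pulls in through AC_REQUIRE is expanded before the body of the outermost AC_DEFUN, and therefore outside the shell 'case'.

```m4
AC_INIT([demo], [0.1])
AC_PROG_CC

# Hypothetical wrapper; DEMO_MAIN_BODY and demo_platform are illustrative names.
AC_DEFUN([DEMO_MAIN_BODY], [
  case $demo_platform in
    alpha*)
      # Prerequisites required by the check below are hoisted in front of
      # the wrapper body, so their initialization runs unconditionally.
      AC_CHECK_DECL([__ELF__])
      ;;
  esac
])

# Invoked exactly once, at the top level.
DEMO_MAIN_BODY

AC_OUTPUT
```

Without the wrapper, the same 'case' at the top level would place any required initialization inside the branch, where it is skipped whenever the condition is false.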
Re: Messed up gl_COMPILER_PREPARE_CHECK_DECL
On 1/13/20 11:02 AM, Bruno Haible wrote:
> I would suggest that the particular 'case' and 'if'/'test'
> statements - or even the entire main body of the configure.ac, from
> line 130 to line 5588 - gets wrapped in an AC_DEFUN that gets invoked once.

Thanks for the diagnosis. I came up with a simpler patch to Emacs, and
installed it into Emacs master (attached). This patch doesn't solve the
general problem, just this particular case.

I doubt whether our collection of Emacs hackers can be induced to remember
that the Autoconf macros you mentioned cannot be executed inside a shell
condition, and I wouldn't be surprised if other configure scripts run into
similar problems. I don't have any specific suggestion to work around this
problem in Gnulib, though.

PS. I vaguely recall a long discussion many years ago when we added this
AC_REQUIRE-ish stuff to Autoconf. Although we did fix some major glitches,
we replaced them with other glitches that live on to this day. For example,
there's now this note in the Autoconf manual:

    Many Autoconf macros use a compiler, and thus call
    `AC_REQUIRE([AC_PROG_CC])' to ensure that the compiler has been
    determined before the body of the outermost `AC_DEFUN' macro.
    Although `AC_PROG_CC' is safe to directly expand multiple times, it
    performs certain checks (such as the proper value of `EXEEXT') only
    on the first invocation. Therefore, care must be used when invoking
    this macro from within another macro rather than at the top level
    (*note Expanded Before Required::).

all of which I had forgotten until I read your email today.
From 49ad550af6b5c1cfcb2fd31962967d7be71bfcc3 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Mon, 13 Jan 2020 16:07:27 -0800
Subject: [PATCH] Port configure.ac to future Gnulib
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Rewrite an ancient Alpha ELF check to port to a future Gnulib
version that may require AC_CHECK_DECL to be set up properly as
per the ‘Expanded Before Required’ section of the Autoconf manual.
Autoconf doesn’t guarantee that AC_CHECK_DECL will work properly
if called conditionally (e.g., inside a shell ‘case’ statement)
and the condition is false. Problem reported by Bruno Haible in:
https://lists.gnu.org/r/bug-gnulib/2020-01/msg00088.html
* configure.ac (LD_SWITCH_MACHINE): Migrate ELF check later,
when AC_CHECK_DECL is properly set up.
---
 configure.ac | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 08a4502122..f040b748d0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1508,6 +1508,7 @@ AC_DEFUN
   UNEXEC_OBJ=unexelf.o
   ;;
 esac
+AC_SUBST(UNEXEC_OBJ)
 
 LD_SWITCH_SYSTEM=
 test "$with_unexec" = no || case "$opsys" in
@@ -1561,8 +1562,6 @@ AC_DEFUN
 test $with_unexec = yes &&
 case $canonical in
  alpha*)
-  AC_CHECK_DECL([__ELF__])
-  if test "$ac_cv_have_decl___ELF__" = "yes"; then
     ## With ELF, make sure that all common symbols get allocated to in the
     ## data section. Otherwise, the dump of temacs may miss variables in
     ## the shared library that have been initialized. For example, with
@@ -1573,18 +1572,10 @@ AC_DEFUN
     else
       AC_MSG_ERROR([Non-GCC compilers are not supported.])
     fi
-  else
-    dnl This was the unexalpha.c case. Removed in 24.1, 2010-07-24,
-    dnl albeit under the mistaken assumption that said file
-    dnl was no longer used.
-    AC_MSG_ERROR([Non-ELF systems are not supported since Emacs 24.1.])
-  fi
   ;;
 esac
 AC_SUBST(C_SWITCH_MACHINE)
 
-AC_SUBST(UNEXEC_OBJ)
-
 C_SWITCH_SYSTEM=
 ## Some programs in src produce warnings saying certain subprograms
 ## are too complex and need a MAXMEM value greater than 2000 for
@@ -4216,6 +4207,12 @@ AC_DEFUN
 AC_CHECK_FUNCS([aligned_alloc posix_memalign], [break])
 AC_CHECK_DECLS([aligned_alloc], [], [], [[#include <stdlib.h>]])
 
+case $with_unexec,$canonical in
+  yes,alpha*)
+    AC_CHECK_DECL([__ELF__], [],
+      [AC_MSG_ERROR([Non-ELF systems are not supported on this platform.])]);;
+esac
+
 # Dump loading
 AC_CHECK_FUNCS([posix_madvise])
 
-- 
2.24.1
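A related pattern, shown here only as a sketch and not as part of the patch above, is to run the check unconditionally and branch on its cached result afterwards in plain shell. ac_cv_have_decl___ELF__ is the cache variable that the removed code in the patch tested; with_unexec and canonical are the Emacs variables from the diff. The error text and the use of echo and exit in place of an Autoconf macro are assumptions of this sketch, chosen so that no macro at all sits inside the condition.

```m4
# Sketch only: run the check on every platform, at the top level.
AC_CHECK_DECL([__ELF__])

# Plain shell from here on: only the cached result is used conditionally.
case $with_unexec,$canonical in
  yes,alpha*)
    if test "$ac_cv_have_decl___ELF__" != yes; then
      echo "configure: error: non-ELF systems are not supported" >&2
      exit 1
    fi
    ;;
esac
```

The trade-off is one extra compiler probe on platforms that never look at the result.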