Unicode support in poke

2020-01-13 Thread Bruno Haible
Hi José,

Yesterday, you identified a set of functions from GNU libunistring that would
be useful to use in GNU poke. Since you will need only a few such functions,
which add up to little code and only one big table, you can take the
respective modules from gnulib - a regular use of gnulib-tool. All the source
code of libunistring is in gnulib, distributed across ca. 350 modules.

The list of modules is as follows. The relation between function name and
module name is obvious.

unitypes
unistr/base
unistr/u8-check
unistr/u8-chr
unistr/u8-cmp
unistr/u8-cmp2
unistr/u8-cpy
unistr/u8-cpy-alloc
unistr/u8-endswith
unistr/u8-mblen
unistr/u8-mbsnlen
unistr/u8-mbtouc
unistr/u8-mbtoucr
unistr/u8-mbtouc-unsafe
unistr/u8-move
unistr/u8-next
unistr/u8-prev
unistr/u8-set
unistr/u8-startswith
unistr/u8-stpcpy
unistr/u8-stpncpy
unistr/u8-strcat
unistr/u8-strchr
unistr/u8-strcmp
unistr/u8-strcoll
unistr/u8-strcpy
unistr/u8-strcspn
unistr/u8-strdup
unistr/u8-strlen
unistr/u8-strmblen
unistr/u8-strmbtouc
unistr/u8-strncat
unistr/u8-strncmp
unistr/u8-strncpy
unistr/u8-strnlen
unistr/u8-strpbrk
unistr/u8-strrchr
unistr/u8-strspn
unistr/u8-strstr
unistr/u8-strtok
unistr/u8-to-u16
unistr/u8-to-u32
unistr/u8-uctomb
unistr/u16-check
unistr/u16-chr
unistr/u16-cmp
unistr/u16-cmp2
unistr/u16-cpy
unistr/u16-cpy-alloc
unistr/u16-endswith
unistr/u16-mblen
unistr/u16-mbsnlen
unistr/u16-mbtouc
unistr/u16-mbtoucr
unistr/u16-mbtouc-unsafe
unistr/u16-move
unistr/u16-next
unistr/u16-prev
unistr/u16-set
unistr/u16-startswith
unistr/u16-stpcpy
unistr/u16-stpncpy
unistr/u16-strcat
unistr/u16-strchr
unistr/u16-strcmp
unistr/u16-strcoll
unistr/u16-strcpy
unistr/u16-strcspn
unistr/u16-strdup
unistr/u16-strlen
unistr/u16-strmblen
unistr/u16-strmbtouc
unistr/u16-strncat
unistr/u16-strncmp
unistr/u16-strncpy
unistr/u16-strnlen
unistr/u16-strpbrk
unistr/u16-strrchr
unistr/u16-strspn
unistr/u16-strstr
unistr/u16-strtok
unistr/u16-to-u32
unistr/u16-to-u8
unistr/u16-uctomb
unistr/u32-check
unistr/u32-chr
unistr/u32-cmp
unistr/u32-cmp2
unistr/u32-cpy
unistr/u32-cpy-alloc
unistr/u32-endswith
unistr/u32-mblen
unistr/u32-mbsnlen
unistr/u32-mbtouc
unistr/u32-mbtoucr
unistr/u32-mbtouc-unsafe
unistr/u32-move
unistr/u32-next
unistr/u32-prev
unistr/u32-set
unistr/u32-startswith
unistr/u32-stpcpy
unistr/u32-stpncpy
unistr/u32-strcat
unistr/u32-strchr
unistr/u32-strcmp
unistr/u32-strcoll
unistr/u32-strcpy
unistr/u32-strcspn
unistr/u32-strdup
unistr/u32-strlen
unistr/u32-strmblen
unistr/u32-strmbtouc
unistr/u32-strncat
unistr/u32-strncmp
unistr/u32-strncpy
unistr/u32-strnlen
unistr/u32-strpbrk
unistr/u32-strrchr
unistr/u32-strspn
unistr/u32-strstr
unistr/u32-strtok
unistr/u32-to-u16
unistr/u32-to-u8
unistr/u32-uctomb
uniconv/base
uniconv/u8-conv-from-enc
uniconv/u8-conv-to-enc
uniconv/u8-strconv-from-enc
uniconv/u8-strconv-from-locale
uniconv/u8-strconv-to-enc
uniconv/u8-strconv-to-locale
uniconv/u16-conv-from-enc
uniconv/u16-conv-to-enc
uniconv/u16-strconv-from-enc
uniconv/u16-strconv-from-locale
uniconv/u16-strconv-to-enc
uniconv/u16-strconv-to-locale
uniconv/u32-conv-from-enc
uniconv/u32-conv-to-enc
uniconv/u32-strconv-from-enc
uniconv/u32-strconv-from-locale
uniconv/u32-strconv-to-enc
uniconv/u32-strconv-to-locale
unistdio/base
unistdio/u8-asnprintf
unistdio/u8-asprintf
unistdio/u8-snprintf
unistdio/u8-sprintf
unistdio/u8-u8-asnprintf
unistdio/u8-u8-asprintf
unistdio/u8-u8-snprintf
unistdio/u8-u8-sprintf
unistdio/u8-u8-vasnprintf
unistdio/u8-u8-vasprintf
unistdio/u8-u8-vsnprintf
unistdio/u8-u8-vsprintf
unistdio/u8-vasnprintf
unistdio/u8-vasprintf
unistdio/u8-vsnprintf
unistdio/u8-vsprintf
unistdio/u16-asnprintf
unistdio/u16-asprintf
unistdio/u16-snprintf
unistdio/u16-sprintf
unistdio/u16-u16-asnprintf
unistdio/u16-u16-asprintf
unistdio/u16-u16-snprintf
unistdio/u16-u16-sprintf
unistdio/u16-u16-vasnprintf
unistdio/u16-u16-vasprintf
unistdio/u16-u16-vsnprintf
unistdio/u16-u16-vsprintf
unistdio/u16-vasnprintf
unistdio/u16-vasprintf
unistdio/u16-vsnprintf
unistdio/u16-vsprintf
unistdio/u32-asnprintf
unistdio/u32-asprintf
unistdio/u32-snprintf
unistdio/u32-sprintf
unistdio/u32-u32-asnprintf
unistdio/u32-u32-asprintf
unistdio/u32-u32-snprintf
unistdio/u32-u32-sprintf
unistdio/u32-u32-vasnprintf
unistdio/u32-u32-vasprintf
unistdio/
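
As a rough illustration of that naming rule for the unistr functions: the
module name is the header directory followed by the function name with
underscores turned into hyphens. The helper name below is made up, and the
rule is only a sketch; it does not cover modules whose names abbreviate the
function name (for instance, u8_conv_from_encoding lives in
uniconv/u8-conv-from-enc).

```sh
#!/bin/sh
# Hypothetical helper: derive a gnulib module name from a header directory
# (e.g. unistr) and a function name declared in that header (e.g. u8_strlen).
module_for () {
  printf '%s/%s\n' "$1" "$(printf '%s' "$2" | tr '_' '-')"
}

module_for unistr u8_strlen          # prints: unistr/u8-strlen
module_for unistr u16_mbtouc_unsafe  # prints: unistr/u16-mbtouc-unsafe
```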

Re: Unicode support in poke

2020-01-13 Thread Tim Rühsen
Hi José,

you could look at libidn2 as an example of how to use the system libunistring
if it is there (and new enough) and fall back to gnulib unistring otherwise.
(BTW, libunistring is made of the gnulib unistring modules)

It creates a separate dir / library for gnulib unistring functions,
*BUT* only uses it when a system libunistring can't be found.

bootstrap.conf: Call gnulib-tool in bootstrap_post_import_hook() only
for the needed unistring modules.

configure.ac: Check for system libunistring (set a conditional
HAVE_LIBUNISTRING).

Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to SUBDIRS

*/Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to includes in
AM_CPPFLAGS

I think, that's all.

Regards, Tim

On 1/13/20 11:33 AM, Bruno Haible wrote:
> Hi José,
> 
> Yesterday, you identified a set of functions from GNU libunistring that would
> be useful to use in GNU poke. Since you will need only a few such functions,
> which add up to little code and only one big table, you can take the
> respective modules from gnulib - a regular use of gnulib-tool. All the source
> code of libunistring is in gnulib, distributed across ca. 350 modules.

presentation at GHM

2020-01-13 Thread Bruno Haible
At the first GNU Hackers Meeting of 2020 I presented the recent and current
work on gnulib. Here are the slides.



haible-gnulib-2020.pdf
Description: Adobe PDF document


Re: Unicode support in poke

2020-01-13 Thread Bruno Haible
Hi Tim,

> you could look at libidn2 as an example of how to use the system libunistring
> if it is there (and new enough) and fall back to gnulib unistring otherwise.
> (BTW, libunistring is made of the gnulib unistring modules)
> 
> It creates a separate dir / library for gnulib unistring functions,
> *BUT* only uses it when a system libunistring can't be found.
> 
> bootstrap.conf: Call gnulib-tool in bootstrap_post_import_hook() only
> for the needed unistring modules.
> 
> configure.ac: Check for system libunistring (set a conditional
> HAVE_LIBUNISTRING).
> 
> Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to SUBDIRS
> 
> */Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to includes in
> AM_CPPFLAGS

A simpler way to achieve the same thing is to include the gnulib module
'libunistring-optional'. It will use the system libunistring if it
exists and is new enough, and otherwise compile the respective modules
from source.
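
For a package that uses gnulib's bootstrap script, this could look roughly
like the following bootstrap.conf fragment. It is only a sketch: the two
unistr modules are picked arbitrarily from the list earlier in the thread,
and poke's real module list will differ.

```sh
# bootstrap.conf fragment (sketch). bootstrap passes the names in
# gnulib_modules to gnulib-tool.
gnulib_modules="
  libunistring-optional
  unistr/u8-check
  unistr/u8-strlen
"
```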

Bruno




Re: Unicode support in poke

2020-01-13 Thread Tim Rühsen
Hi Bruno,

On 1/13/20 12:01 PM, Bruno Haible wrote:
> Hi Tim,
> 
>> you could look at libidn2 as an example of how to use the system libunistring
>> if it is there (and new enough) and fall back to gnulib unistring otherwise.
>> (BTW, libunistring is made of the gnulib unistring modules)
>>
>> It creates a separate dir / library for gnulib unistring functions,
>> *BUT* only uses it when a system libunistring can't be found.
>>
>> bootstrap.conf: Call gnulib-tool in bootstrap_post_import_hook() only
>> for the needed unistring modules.
>>
>> configure.ac: Check for system libunistring (set a conditional
>> HAVE_LIBUNISTRING).
>>
>> Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to SUBDIRS
>>
>> */Makefile.am: if !HAVE_LIBUNISTRING -> add unistring/ to includes in
>> AM_CPPFLAGS
> 
> A simpler way to achieve the same thing is to include the gnulib module
> 'libunistring-optional'. It will use the system libunistring if it
> exists and is new enough, and otherwise compile the respective modules
> from source.

Ah, I didn't know that, thanks.

Doesn't that pull in all the unistring modules (and code)? At least for
building the project.

In libidn2 we just use a small subset of the vast functionality of
gnulib unistring. In order to keep build overhead small, isn't that the
way to go? I only know that building libunistring takes a while...

Regards, Tim





Re: Unicode support in poke

2020-01-13 Thread Bruno Haible
Hi Tim,

> > A simpler way to achieve the same thing is to include the gnulib module
> > 'libunistring-optional'. It will use the system libunistring if it
> > exists and is new enough, and otherwise compile the respective modules
> > from source.
> 
> Ah, I didn't know that, thanks.
> 
> Doesn't that pull in all the unistring modules (and code)? At least for
> building the project.

No, it will only pull in the modules that you have requested and their
dependencies.

Bruno




Re: Messed up gl_COMPILER_PREPARE_CHECK_DECL

2020-01-13 Thread Bruno Haible
Hi Paul,

> that incorrect line comes because ac_compile_for_check_decl is used 
> before it is set. And this occurs because Emacs's configure.ac's first 
> use of AC_CHECK_DECL is executed only on alpha platforms (which my 
> platform is not), which means the initialization of 
> ac_compile_for_check_decl is skipped.

The Emacs configure.ac is not written in a robust way. It contains
'case' and 'if'/'test' statements which conditionally execute the Autoconf
macros

  AC_MSG_CHECKING
  AC_MSG_ERROR
  AC_MSG_WARN
  AC_MSG_RESULT
  AC_CACHE_CHECK
  AC_LINK_IFELSE
  AC_LANG_PROGRAM
  AC_PATH_PROG
  AC_CHECK_DECL
  AC_DEFINE
  AC_CHECK_HEADER
  AC_CHECK_PROG
  AC_CHECK_FUNCS
  AC_COMPILE_IFELSE
  AC_CHECK_LIB
  AC_TRY_LINK
  AC_PREPROC_IFELSE
  AC_CONFIG_FILES
  AC_MSG_NOTICE

If any of these macros, in its implementation, performs an AC_REQUIRE, you
may get a problem, as described in [1].

Do we have a statement in the Autoconf documentation that none of the built-in
macros does an AC_REQUIRE? I don't think so. Therefore I would suggest that
the particular 'case' and 'if'/'test' statements - or even the entire main body
of the configure.ac, from line 130 to line 5588 - gets wrapped in an AC_DEFUN
that gets invoked once.
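
To make the failure mode concrete, here is a minimal plain-shell imitation,
not actual Autoconf output. The function names, the host triplets and the
'cc -c' value are made up for illustration; only the variable name
ac_compile_for_check_decl comes from the report. In the first function the
one-time setup sits at the first textual use, inside a 'case' branch. In the
second it has been hoisted in front of the conditional code, which is the
effect AC_REQUIRE has once the surrounding code is wrapped in an AC_DEFUN.

```sh
#!/bin/sh
# Plain-shell imitation of the problem; NOT real Autoconf output.
# Function names and the 'cc -c' value are hypothetical.

# Setup emitted at the first textual use, which is inside a 'case' branch.
broken_configure () (
  canonical=$1
  ac_compile_for_check_decl=
  case $canonical in
    alpha*)
      ac_compile_for_check_decl='cc -c'   # one-time setup lands here
      : "first check would run here"
      ;;
  esac
  # A later check assumes the setup has already happened.
  printf '%s\n' "$ac_compile_for_check_decl"
)

# Setup hoisted in front of all conditional code.
hoisted_configure () (
  canonical=$1
  ac_compile_for_check_decl='cc -c'       # one-time setup, unconditional
  case $canonical in
    alpha*)
      : "first check would run here"
      ;;
  esac
  printf '%s\n' "$ac_compile_for_check_decl"
)

broken_configure x86_64-pc-linux-gnu    # prints an empty line
hoisted_configure x86_64-pc-linux-gnu   # prints: cc -c
```

On a non-alpha triplet the first variant reaches the later check with an
empty command, which matches the symptom Paul describes.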

The other alternative is to state formally that none of these AC_* macros do
an AC_REQUIRE. But then it's hard to find a proper place for the
gl_COMPILER_PREPARE_CHECK_DECL invocation (without adding extra code to
configure scripts that don't use AC_CHECK_DECL). What would you suggest?

Bruno

[1] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Prerequisite-Macros.html




Re: Messed up gl_COMPILER_PREPARE_CHECK_DECL

2020-01-13 Thread Paul Eggert

On 1/13/20 11:02 AM, Bruno Haible wrote:

> I would suggest that
> the particular 'case' and 'if'/'test' statements - or even the entire main body
> of the configure.ac, from line 130 to line 5588 - gets wrapped in an AC_DEFUN
> that gets invoked once.


Thanks for the diagnosis. I came up with a simpler patch to Emacs, and 
installed it into Emacs master (attached).


This patch doesn't solve the general problem, just this particular case. 
I doubt whether our collection of Emacs hackers can be induced to 
remember that the Autoconf macros you mentioned cannot be executed 
inside a shell condition, and I wouldn't be surprised if other configure 
scripts run into similar problems. I don't have any specific suggestion 
to work around this problem in Gnulib, though.


PS. I vaguely recall a long discussion many years ago when we added this 
AC_REQUIRE-ish stuff to Autoconf. Although we did fix some major 
glitches, we replaced them with other glitches that live on to this 
day. For example, there's now this note in the Autoconf manual:


 Many Autoconf macros use a compiler, and thus call
 `AC_REQUIRE([AC_PROG_CC])' to ensure that the compiler has been
 determined before the body of the outermost `AC_DEFUN' macro.
 Although `AC_PROG_CC' is safe to directly expand multiple times, it
 performs certain checks (such as the proper value of `EXEEXT') only
 on the first invocation.  Therefore, care must be used when
 invoking this macro from within another macro rather than at the
 top level (*note Expanded Before Required::).

all of which I had forgotten until I read your email today.
From 49ad550af6b5c1cfcb2fd31962967d7be71bfcc3 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 13 Jan 2020 16:07:27 -0800
Subject: [PATCH] Port configure.ac to future Gnulib
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Rewrite an ancient Alpha ELF check to port to a future Gnulib
version that may require AC_CHECK_DECL to be set up properly as
per the ‘Expanded Before Required’ section of the Autoconf manual.
Autoconf doesn’t guarantee that AC_CHECK_DECL will work properly
if called conditionally (e.g., inside a shell ‘case’ statement)
and the condition is false.  Problem reported by Bruno Haible in:
https://lists.gnu.org/r/bug-gnulib/2020-01/msg00088.html
* configure.ac (LD_SWITCH_MACHINE): Migrate ELF check later,
when AC_CHECK_DECL is properly set up.
---
 configure.ac | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 08a4502122..f040b748d0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1508,6 +1508,7 @@ AC_DEFUN
UNEXEC_OBJ=unexelf.o
;;
 esac
+AC_SUBST(UNEXEC_OBJ)
 
 LD_SWITCH_SYSTEM=
 test "$with_unexec" = no || case "$opsys" in
@@ -1561,8 +1562,6 @@ AC_DEFUN
 test $with_unexec = yes &&
 case $canonical in
  alpha*)
-  AC_CHECK_DECL([__ELF__])
-  if test "$ac_cv_have_decl___ELF__" = "yes"; then
 ## With ELF, make sure that all common symbols get allocated to in the
 ## data section.  Otherwise, the dump of temacs may miss variables in
 ## the shared library that have been initialized.  For example, with
@@ -1573,18 +1572,10 @@ AC_DEFUN
 else
   AC_MSG_ERROR([Non-GCC compilers are not supported.])
 fi
-  else
-  dnl This was the unexalpha.c case.  Removed in 24.1, 2010-07-24,
-  dnl albeit under the mistaken assumption that said file
-  dnl was no longer used.
-  AC_MSG_ERROR([Non-ELF systems are not supported since Emacs 24.1.])
-  fi
   ;;
 esac
 AC_SUBST(C_SWITCH_MACHINE)
 
-AC_SUBST(UNEXEC_OBJ)
-
 C_SWITCH_SYSTEM=
 ## Some programs in src produce warnings saying certain subprograms
 ## are too complex and need a MAXMEM value greater than 2000 for
@@ -4216,6 +4207,12 @@ AC_DEFUN
 AC_CHECK_FUNCS([aligned_alloc posix_memalign], [break])
 AC_CHECK_DECLS([aligned_alloc], [], [], [[#include <stdlib.h>]])
 
+case $with_unexec,$canonical in
+  yes,alpha*)
+AC_CHECK_DECL([__ELF__], [],
+  [AC_MSG_ERROR([Non-ELF systems are not supported on this platform.])]);;
+esac
+
 # Dump loading
 AC_CHECK_FUNCS([posix_madvise])
 
-- 
2.24.1