[The message below was originally sent only to Arnold, but I intended
it to go to 23...@debbugs.gnu.org as well. Seeing as the conversation
regarding multi-threaded grep operation is continuing, I've decided to
forward it to the bug list. Apologies to Arnold (and others as
appropriate) if this is a duplicate. -- sur-behoffski]
-------- Forwarded Message --------
Subject: Re: bug#23269: Multi-threaded operation, mbrtowc, and "untangle" script [was Re: bug#23269...]
Date: Thu, 21 Apr 2016 21:32:15 +0930
From: sur-behoffski <sur_behoff...@grouse.com.au>
To: arn...@skeeve.com
On 04/21/16 19:25, arn...@skeeve.com wrote:
> sur-behoffski <sur_behoff...@grouse.com.au> wrote:
> > So, I'm not sure if a thread-safe (i.e. locale-safe) version of mbrtowc
> > exists; if not, this needs to be addressed before a split-locale,
> > multi-threaded version is feasible. (LC_CTYPE race conditions?)
> By definition, mbrtowc is thread safe. The question relates better
> to setlocale(), or rather to the underlying internal locale data. I don't
> think the current POSIX model lends itself to multiple locales within
> the same process.
Thanks for the response. As noted in the man pages, the thread safety
does not extend to multi-locale settings, and this is explicitly what Paul
was hoping for in the message that I replied to:
On 04/21/16 02:10, Paul Eggert wrote:
> [...]
> One thing that bugged me about dfa.c (when I was looking at this
> yesterday) is that it maintains some state in static variables, which
> means it can't be used in multiple threads using different locales.
> That's not an issue with grep or gawk now, but might be for other
> apps and might conceivably be a problem even in grep, which has a
> multithreaded patch pending and might conceivably want to use per-file
> encodings. [...]
"man 3 mbrtowc" on my Gentoo system has the following text in the ATTRIBUTES,
CONFORMING TO, NOTES and COLOPHON sections:
------ (Start of excerpt) ------
ATTRIBUTES
For an explanation of the terms used in this section, see attributes(7).
+----------+---------------+----------------------------+
|Interface | Attribute     | Value                      |
+----------+---------------+----------------------------+
|mbrtowc() | Thread safety | MT-Unsafe race:mbrtowc/!ps |
+----------+---------------+----------------------------+
CONFORMING TO
POSIX.1-2001, POSIX.1-2008, C99.
NOTES
The behavior of mbrtowc() depends on the LC_CTYPE category of the
current locale.
[...]
COLOPHON
This page is part of release 4.04 of the Linux man-pages project. A
description of the project, information about reporting bugs, and the
latest version of this page, can be found at
http://www.kernel.org/doc/man-pages/.
GNU                            2015-08-08                        MBRTOWC(3)
------ (End of excerpt) ------
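As an illustration of the "race:mbrtowc/!ps" annotation in the excerpt: as I read attributes(7), the race applies when the ps argument is NULL, because mbrtowc() then falls back on an internal static conversion state shared by all callers. The sketch below passes a caller-owned mbstate_t instead. The helper name count_chars is hypothetical and is not taken from grep or dfa.c. Note that this only sidesteps the shared-state race; the result still depends on the process-wide LC_CTYPE setting, so it does nothing for the multi-locale case discussed above.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not from grep or dfa.c): count the characters in
   the first n bytes of s, as decoded under the current LC_CTYPE locale.
   Returns (size_t) -1 on an invalid or truncated multibyte sequence.  */
size_t
count_chars (const char *s, size_t n)
{
  mbstate_t state;
  size_t count = 0;

  /* Start from the initial conversion state.  */
  memset (&state, 0, sizeof state);

  while (n > 0)
    {
      wchar_t wc;
      /* Non-NULL ps: mbrtowc uses our state object rather than its
         internal static one.  */
      size_t len = mbrtowc (&wc, s, n, &state);

      if (len == (size_t) -1 || len == (size_t) -2)
        return (size_t) -1;
      if (len == 0)
        len = 1;        /* A null character still consumes one byte.  */

      s += len;
      n -= len;
      count++;
    }

  return count;
}
```

A program that has not called setlocale() runs in the "C" locale, where an ASCII-only buffer such as "grep" with n = 4 is counted as four characters. Each thread can keep its own mbstate_t on its stack, so no locking is needed around the conversion state itself.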
cheers,
sur-behoffski (Brenton Hoff)
Programmer, Grouse Software