Branch: refs/heads/smoke-me/khw-env
  Home:   https://github.com/Perl/perl5
  Commit: 72cca921141163fecc7a16a78d5950f9a3e80432
      
https://github.com/Perl/perl5/commit/72cca921141163fecc7a16a78d5950f9a3e80432
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M embed.fnc
    M embed.h
    M embedvar.h
    M handy.h
    M inline.h
    M intrpvar.h
    M locale.c
    M makedef.pl
    M mg.c
    M perl.c
    M perl.h
    M pod/perlvar.pod
    M proto.h
    M sv.c

  Log Message:
  -----------
  Add ability to emulate thread-safe locale operations

Locale information was originally global for an entire process.  Later,
it was realized that different threads could want to be running in
different locales.  Windows added this ability, and POSIX 2008 followed
suit (though using a completely different API).  When available, perl
automatically uses these capabilities.

But many platforms have neither, or their implementation, such as on
Darwin, is buggy.  This commit adds the capability for Perl programs to
operate as if the platform were thread-safe.

This implementation is based on the observation that the underlying
locale matters only to relatively few libc calls, and only during their
execution.  It can be anything at all at any other time.  perl keeps
what the proper locale should be for each category in a per-thread
array.  Each locale-dependent operation must be wrapped in mutex
lock/unlock operations.  The lock additionally compares what libc knows
the locale to be, and what it should be for this thread at this time,
and changes the actual locale to the proper value if necessary.  That's
all that is needed.
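
As a rough illustration of the scheme just described (not the code in
this commit), the lock step could look something like the sketch below.
It assumes POSIX threads and C11 thread-local storage; every name in it
(locale_mutex, wanted, thread_set_wanted, emulated_locale_lock, ...) is
made up for the example and is not a perl identifier.

```c
#include <assert.h>
#include <locale.h>
#include <pthread.h>
#include <string.h>

/* All names here are hypothetical; this is not perl's implementation. */
enum { CAT_CTYPE, CAT_NUMERIC, CAT_TIME, CAT_COUNT };
static const int cat_ids[CAT_COUNT] = { LC_CTYPE, LC_NUMERIC, LC_TIME };

/* One process-wide mutex guards every locale-dependent libc call. */
static pthread_mutex_t locale_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread record of what each category *should* be for this thread. */
static _Thread_local char wanted[CAT_COUNT][64] = { "C", "C", "C" };

/* Record the locale this thread wants; libc is not touched here. */
void thread_set_wanted(int cat, const char *name) {
    assert(cat >= 0 && cat < CAT_COUNT);
    strncpy(wanted[cat], name, sizeof wanted[cat] - 1);
    wanted[cat][sizeof wanted[cat] - 1] = '\0';
}

/* Take the mutex, then make libc's global locale agree with this thread.
 * Returns how many categories had to be switched. */
int emulated_locale_lock(void) {
    int switched = 0;
    pthread_mutex_lock(&locale_mutex);
    for (int i = 0; i < CAT_COUNT; i++) {
        const char *actual = setlocale(cat_ids[i], NULL);
        if (actual == NULL || strcmp(actual, wanted[i]) != 0) {
            if (setlocale(cat_ids[i], wanted[i]) != NULL)
                switched++;
        }
    }
    return switched;
}

void emulated_locale_unlock(void) {
    pthread_mutex_unlock(&locale_mutex);
}
```

A caller would bracket each locale-dependent libc call with
emulated_locale_lock() and emulated_locale_unlock().  When the global
locale already matches the thread's wishes, the lock costs only the
mutex plus a few string comparisons.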

This commit adds macros to perl.h, for example "MBTOWC_LOCK_", that
expand to do the mutex lock, and change the global locale to the
expected value.  On perls built without this emulation capability, they
are no-ops.  All code in the perl core (unless I've missed something)
is changed to use these macros (there weren't actually many places that
needed this).  Thus, any pure perl program will automatically become
locale-thread-safe under this Configuration.
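
To make the "no-ops unless emulating" idea concrete, here is a minimal
sketch of how such a macro pair might be arranged.  The build switch
EMULATE_LOCALE_THREAD_SAFETY and the DEMO_ names are assumptions for
illustration only; the real macros live in perl.h and also bring the
global locale in line with the thread, which this sketch leaves out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* EMULATE_LOCALE_THREAD_SAFETY is a hypothetical build-time switch. */
#ifdef EMULATE_LOCALE_THREAD_SAFETY
static pthread_mutex_t demo_locale_mutex = PTHREAD_MUTEX_INITIALIZER;
#  define DEMO_MBTOWC_LOCK    pthread_mutex_lock(&demo_locale_mutex)
#  define DEMO_MBTOWC_UNLOCK  pthread_mutex_unlock(&demo_locale_mutex)
#else
   /* Without the emulation the macros compile away to nothing. */
#  define DEMO_MBTOWC_LOCK    ((void) 0)
#  define DEMO_MBTOWC_UNLOCK  ((void) 0)
#endif

/* Convert one multibyte character, holding the lock only for the
 * duration of the locale-dependent libc call. */
int demo_mbtowc(wchar_t *out, const char *s, size_t n) {
    assert(out != NULL && s != NULL);   /* precondition for this demo */
    DEMO_MBTOWC_LOCK;
    int len = mbtowc(out, s, n);
    DEMO_MBTOWC_UNLOCK;
    return len;
}
```

The call site reads the same in both builds, so the wrapping can be
left in place unconditionally.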

In order for XS code to also become locale-thread-safe, it must use
these macros to wrap calls to locale-dependent functions.  Relatively
few modules call such functions.  For example, the only one I found that
ships with the perl core is Time::Piece, and it has more fundamental
issues with running under threads than this.  I am preparing pull
requests for it.

Thus, this is not as transparent to code as native thread-safe locale
handling is.  Therefore ${^SAFE_LOCALES} returns 2 (instead of 1) for
this type of thread-safety.

Another deficiency compared to the native thread safety is when a thread
calls a non-perl library that accesses the locale.  The typical example is
Gtk (though this particular application can be configured to not be
problematic).  With the native safe threads, everything works as long as
only one such thread is used per Perl program.  That thread would then
be the only one operating in the global locale, hence there are no
conflicts.  With this emulation, all threads are operating in the global
locale, and mutexes would have to be used to prevent conflicts.  To
minimize those, the code added in this commit restores the global
locale, once it is through, to the state it was in when it started.
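
The save-and-restore idea can be sketched in plain C as below.  This is
only an illustration under my own assumptions: the saved_locale type
and both function names are invented for the example, and the mutex
handling that would surround it in a threaded build is omitted.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: remembers one category's previous locale. */
typedef struct {
    int   category;
    char *name;      /* heap copy; setlocale()'s buffer may be reused */
} saved_locale;

/* Remember the current locale for `category`, then switch to `temp`.
 * Returns 0 on success, -1 on failure (nothing is changed then). */
int save_and_switch(saved_locale *s, int category, const char *temp) {
    assert(s != NULL && temp != NULL);
    s->name = NULL;
    const char *cur = setlocale(category, NULL);
    if (cur == NULL)
        return -1;
    char *copy = malloc(strlen(cur) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, cur);
    if (setlocale(category, temp) == NULL) {
        free(copy);
        return -1;
    }
    s->category = category;
    s->name = copy;
    return 0;
}

/* Put the global locale back the way save_and_switch() found it. */
void restore_saved(saved_locale *s) {
    if (s->name != NULL) {
        setlocale(s->category, s->name);
        free(s->name);
        s->name = NULL;
    }
}
```

Copying the name matters because the string returned by setlocale()
may be overwritten by a later setlocale() call.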

A major concern is the performance impact.  This is after all trading
speed for accuracy.  lib/locale_threads.t is noticeably slower when this
is being used.  But that test has multiple threads constantly using
locale-dependent operations.  I don't notice any change with the rest of
the test suite.  In pure perl, this only comes into play while in the
scope of 'use locale' or when using some of the few POSIX:: functions
that are locale-dependent.  And to some extent when formatting, but the
regular overhead there should dwarf what this adds.

This commit leaves this feature off by default.  The next commit changes
that for the next few 5.39 development releases, so we can see if there
is actually an issue.


  Commit: ced23ace06c24f7f1860c407aa29af170fbdeb3d
      
https://github.com/Perl/perl5/commit/ced23ace06c24f7f1860c407aa29af170fbdeb3d
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c
    M makedef.pl
    M perl.h

  Log Message:
  -----------
  Experimentally enable per-thread locale emulation

This is set to end in 5.39.10, but will give us field experience in the
meantime.


  Commit: c20ca85ae09adfdda187920c8fcdd1b43f03dcdf
      
https://github.com/Perl/perl5/commit/c20ca85ae09adfdda187920c8fcdd1b43f03dcdf
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M makedef.pl
    M perl.h

  Log Message:
  -----------
  Don't do thread-safe locales emulation on mingw

MingW when compiled with the Universal C runtime (UCRT) is thread-safe
with respect to locales, just as VS 2015 and later MSVCRT compilations
are.

However, versions not using UCRT cannot be compiled to emulate
thread-safe locale.  I'm pretty sure this is due to a bug in the libc
strftime() function, having spent a bunch of hours working on this.

It often fails lib/locale_threads.t when using the emulation, but not
always.  The failure is always in strftime().

What made me think it could be perl is another characteristic of the
failures.  lib/locale_threads.t works by, in each thread, setting each
available locale category to a locale, different from any other category
in that thread, and as different as possible from the locale for the
corresponding category in any other thread.  For example thread 0 might
have LC_CTYPE set to locale X, LC_NUMERIC to Y, LC_TIME to Z, etc.
Thread 1 would use a locale for LC_CTYPE, as different from X as
possible, meaning executing the same operation on thread 0 and thread 1
would yield different expected results.  (It goes to some lengths to
calculate the biggest distance in the results.)  Similarly LC_NUMERIC
would have something almost completely different from Y; and so on.

Then each thread executes a batch of iterations.  Each iteration runs
all the operations I could find that perl uses that apply to LC_CTYPE,
and all the ones that apply to each of the other categories.  And
verifies that all the results are as expected.

Simultaneously, the other threads are executing their batch.  It is
verifying that there is no bleed-through from one thread to another.  If
the threads all have the same results as the other threads, we couldn't
detect if there is real bleed-through or not.  This is solved by making
the results for each category as different as possible from any other
thread currently executing.

However, this isn't good enough.  Every so many iterations, each thread
changes to use a new set of locales.  This verifies that the locales can
be changed in a thread without that bleeding through to other threads.

And thread 0 is special.  It harvests the other threads as they finish,
and keeps going for a while.  This is to catch bugs in thread
completion, of which we've had a few.

MingW's failures all occur, when they occur, on the first iteration
following a switch to a new set of locales.  That is suspiciously like
it is a race condition in cleaning up from the previous setting.  But
the failure isn't necessarily in the first test of that first
iteration.  It can be the 10th or so test.  I added enough debugging
statements to convince me that it isn't perl.

This is the failing code in locale.c:

        STRFTIME_LOCK;
        int len = strftime(buf, bufsize, fmt, mytm);
        STRFTIME_UNLOCK;

The returned 'buf' is not always correct.

The LOCK/UNLOCK macros on MingW with thread-safe emulation enabled call
EnterCriticalSection(), and set the locales for the categories that
affect strftime() to the proper locale.  Just to be sure, I tested
setting LC_ALL to the correct value.  While in its uninterruptible (by
other locale handling code anyway) section, strftime() fills buf with
the result for the current locale (which STRFTIME_LOCK has set).
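
For readers who want to try the same pattern outside of perl, a
self-contained approximation follows.  It is a sketch under my own
assumptions: a POSIX mutex stands in for the Windows critical section,
the names strftime_mutex and locked_strftime are invented, and the
locale-syncing part of STRFTIME_LOCK is not reproduced.

```c
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Stand-in for the critical section; hypothetical name. */
static pthread_mutex_t strftime_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Call strftime() while no other user of this mutex can run.
 * Returns the number of bytes written, or 0 if the result didn't fit. */
size_t locked_strftime(char *buf, size_t bufsize,
                       const char *fmt, const struct tm *tm) {
    assert(buf != NULL && bufsize > 0 && fmt != NULL && tm != NULL);
    pthread_mutex_lock(&strftime_mutex);
    size_t len = strftime(buf, bufsize, fmt, tm);
    pthread_mutex_unlock(&strftime_mutex);
    return len;
}
```

With tm_mon set to 2 and the process still in the C locale, a "%b"
format yields "Mar", which makes a convenient baseline to compare
against the per-locale results.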

I added print statements within the critical section, as follows:

        STRFTIME_LOCK;
        DEBUG_U(PerlIO_printf(Perl_debug_log,
                              "calling strftime(%s), LC_ALL=%s\n",
                              fmt, setlocale(LC_ALL, NULL)));
        int len = strftime(buf, bufsize, fmt, mytm);
        DEBUG_U(PerlIO_printf(Perl_debug_log,
                              "return=%s, LC_ALL=%s\n",
                              buf, setlocale(LC_ALL, NULL)));
        STRFTIME_UNLOCK;

On this platform, setlocale() expands to _wsetlocale(), a Windows libc
call.

Here's what they showed for one failure.

        calling strftime(%b), LC_ALL=Hungarian_Hungary.1250
        return=marc., LC_ALL=Hungarian_Hungary.1250

The 'a' in the Hungarian for March is supposed to be a U+00E1, with an
acute accent, so this is wrong.
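
When chasing this kind of single-byte difference, it can help to look
at the raw bytes rather than at what a terminal chooses to display.
The helper below is a small sketch for that purpose; hex_bytes is a
name made up for this example and is not part of perl.  In CP 1250,
U+00E1 is the single byte 0xE1, whereas plain 'a' is 0x61.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the bytes of `s` into `out` as space-separated uppercase hex.
 * Output is truncated at a whole byte if `out` is too small.
 * Returns `out`. */
char *hex_bytes(const char *s, char *out, size_t outsize) {
    assert(s != NULL && out != NULL && outsize > 0);
    size_t used = 0;
    size_t len = strlen(s);
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        int n = snprintf(out + used, outsize - used,
                         i ? " %02X" : "%02X",
                         (unsigned) (unsigned char) s[i]);
        if (n < 0 || (size_t) n >= outsize - used) {
            out[used] = '\0';   /* drop a partially written byte */
            break;
        }
        used += (size_t) n;
    }
    return out;
}
```

For the failing return value the dump would read "6D 61 72 63 2E",
while the expected CP 1250 string would read "6D E1 72 63 2E".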

strftime() is also passed a pointer to a struct tm, which is filled in
with various integers which indicate in this case which month the %b is
supposed to return.  That it is returning something very much like
'márc.' indicates those integers are correct.

Not shown in the example above are the other prints I added to verify
that we are indeed in a critical section.  I didn't see a way to
actually test for this via a libc call, but one could use strace and
wade through the output.  But there are print statements that print out
immediately before entering a critical section, and immediately after
leaving it.  I verified that those prints indicate this code is in a
critical section.

I note that this box actually doesn't have very many locales, so that
the distance between the results of various threads isn't all that
large.  Pretty much all the locales are CP 1250, 1251, 1252, and 1257,
and there are no UTF-8 ones, so all locales are single byte.  None of
them map \xE1 into plain 'a', which is what we are seeing returned, so
the cleanup theory seems wrong.  Sometimes the return is '?' or a
series of them, indicating that the returned character is mojibake.

None of the locales I saw had 'marc.' as a possible return.  It appears
only here in the entire trace of all threads.  This again makes it less
likely that it is a cleanup issue.  But where did it come from?  I
don't know.  The value for the C locale is 'Mar', so it didn't come from
there.

The localeconv() function is also broken in this Configuration.  We long
ago figured out a workaround for that.  I tried that same workaround for
strftime(), and it didn't help.


  Commit: 2d6162801161dd7ce23f0f72e2fff95b854e180e
      
https://github.com/Perl/perl5/commit/2d6162801161dd7ce23f0f72e2fff95b854e180e
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  DEBUG Lv to U


  Commit: ddff4d33f242ab173644f3d6a96b3907dd84ecf8
      
https://github.com/Perl/perl5/commit/ddff4d33f242ab173644f3d6a96b3907dd84ecf8
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  extra debug


  Commit: 21462fd424e95ccbef97a5916098bd6abec8408d
      
https://github.com/Perl/perl5/commit/21462fd424e95ccbef97a5916098bd6abec8408d
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  more emul locks


  Commit: e10ba56c3de4ed7b5739451c79e8e8e94928829b
      
https://github.com/Perl/perl5/commit/e10ba56c3de4ed7b5739451c79e8e8e94928829b
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  Revert "more emul locks"

This reverts commit 4733a1674423ee47b33eb0ee1882e1bf39faa1a6.


  Commit: e7fec5c1dad928480c718cf5757323b828ac7a99
      
https://github.com/Perl/perl5/commit/e7fec5c1dad928480c718cf5757323b828ac7a99
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  langinfo lock


  Commit: 95149406ec35774b39858e101b42f9264a950068
      
https://github.com/Perl/perl5/commit/95149406ec35774b39858e101b42f9264a950068
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  Revert "langinfo lock"

This reverts commit acaff35d7ed83830fb36c149aafede5cdf400061.


  Commit: a1ea6b95000a9e0a7902027d5e2eaa36f1d153f4
      
https://github.com/Perl/perl5/commit/a1ea6b95000a9e0a7902027d5e2eaa36f1d153f4
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  lock mask


  Commit: db84671fa4873b43f38c7d3489786a671348d76b
      
https://github.com/Perl/perl5/commit/db84671fa4873b43f38c7d3489786a671348d76b
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  Revert "lock mask"

This reverts commit 3fd528c9d5d5b9c05dc1c697e61570b81811fb95.


  Commit: 5516bc47b36c8c1db8c22afbeb24371b114bdcb9
      
https://github.com/Perl/perl5/commit/5516bc47b36c8c1db8c22afbeb24371b114bdcb9
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  locale.c: Maybe comment'


  Commit: 84efee8a847b1875aba2b692b8897eb5c35cd245
      
https://github.com/Perl/perl5/commit/84efee8a847b1875aba2b692b8897eb5c35cd245
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M perl.h

  Log Message:
  -----------
  XXX perl.h maybe drop


  Commit: 0261e252f19c508a06f2653cc8633adbd7733d47
      
https://github.com/Perl/perl5/commit/0261e252f19c508a06f2653cc8633adbd7733d47
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M makedef.pl

  Log Message:
  -----------
  makedef.pl: PL_cur_locale_obj is only POSIX 2008 multiplicity


  Commit: 8a81e9366fa55443e2cf66ff587cc5b08b8b7d60
      
https://github.com/Perl/perl5/commit/8a81e9366fa55443e2cf66ff587cc5b08b8b7d60
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M lib/locale_threads.t

  Log Message:
  -----------
  locale_threads.t: Better handle weird locales

The previous code was generating bunches of uninitialized variable
warnings, due to 1) not checking for definedness early; 2) the loop
termination needs to be reevaluated each time because there is a
potential splice, shortening the array.

This only happens, I believe, on MingW not using UCRT.


  Commit: d1b6664a466b9c62895ff5859bf0e7e6fd0f8195
      
https://github.com/Perl/perl5/commit/d1b6664a466b9c62895ff5859bf0e7e6fd0f8195
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M lib/locale_threads.t

  Log Message:
  -----------
  Revert "locale_threads.t: Skip on OpenBSD and DragonFly threaded builds"

This reverts commit 1d74e8214dd53cf0fa9e8c5aab3e6187685eadcd, as they
have been modified


  Commit: f14e679c26ab22c9bb6713e0999872345b3a941c
      
https://github.com/Perl/perl5/commit/f14e679c26ab22c9bb6713e0999872345b3a941c
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  Debug uselocale


  Commit: e6be6275903edca66c22d724033996bcd1680814
      
https://github.com/Perl/perl5/commit/e6be6275903edca66c22d724033996bcd1680814
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M pp.c

  Log Message:
  -----------
  pp_study: hook


  Commit: e72f1fd1033587ac06c2ba6e6e1de09738a7eb0e
      
https://github.com/Perl/perl5/commit/e72f1fd1033587ac06c2ba6e6e1de09738a7eb0e
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M sv.c

  Log Message:
  -----------
  sv.c need to check for pv in sv in sv_setpvf


  Commit: 362acbeb5480fb7c4829ee10520a01a4ba5ce8fc
      
https://github.com/Perl/perl5/commit/362acbeb5480fb7c4829ee10520a01a4ba5ce8fc
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M sv.c

  Log Message:
  -----------
  perlapi: Add detail to sv_setpv_bufsize()


  Commit: ba90a5428d3c54cec3507880fec97d13b86b1e15
      
https://github.com/Perl/perl5/commit/ba90a5428d3c54cec3507880fec97d13b86b1e15
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M perl.h

  Log Message:
  -----------
  locale mutexes: Win32 are general without simulating

We can get rid of the simulation needed for other platforms.


  Commit: 6877a56558270e9656a000fff9ecb14cf593136c
      
https://github.com/Perl/perl5/commit/6877a56558270e9656a000fff9ecb14cf593136c
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  White


  Commit: 08fb06533e1799f9452efd5610e43aba7ef49c53
      
https://github.com/Perl/perl5/commit/08fb06533e1799f9452efd5610e43aba7ef49c53
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  fundamental


  Commit: a2c12b85720879431f8ac77eb00925e27810eadd
      
https://github.com/Perl/perl5/commit/a2c12b85720879431f8ac77eb00925e27810eadd
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M embed.fnc
    M embed.h
    M locale.c
    M proto.h

  Log Message:
  -----------
  immediate use


  Commit: 38492b3886cc17a6866281a000a1955f89f3398c
      
https://github.com/Perl/perl5/commit/38492b3886cc17a6866281a000a1955f89f3398c
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  more immed


  Commit: 50b42ce7d182158de7ebe2c7e6eb6a0e7a7b27d4
      
https://github.com/Perl/perl5/commit/50b42ce7d182158de7ebe2c7e6eb6a0e7a7b27d4
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M embed.fnc
    M embed.h
    M locale.c
    M proto.h

  Log Message:
  -----------
  add is_cur_locale_utf8


  Commit: 0a4ddf35874ea992223c972a62bc55869ca8550d
      
https://github.com/Perl/perl5/commit/0a4ddf35874ea992223c972a62bc55869ca8550d
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  move subs around


  Commit: 8083cab90a486a15249eeaa366819c79e3b611ab
      
https://github.com/Perl/perl5/commit/8083cab90a486a15249eeaa366819c79e3b611ab
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  locale.c: Comments, white space


  Commit: c8f32179ff4f82ae8a16db5abd8cd84b69b17c41
      
https://github.com/Perl/perl5/commit/c8f32179ff4f82ae8a16db5abd8cd84b69b17c41
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  MULT


  Commit: e10268aacd4259dda8f43a21540731ebb9ef11b9
      
https://github.com/Perl/perl5/commit/e10268aacd4259dda8f43a21540731ebb9ef11b9
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M locale.c

  Log Message:
  -----------
  final


  Commit: e4abdaece6adcdf925cf87d2780ff28f0a4d7d29
      
https://github.com/Perl/perl5/commit/e4abdaece6adcdf925cf87d2780ff28f0a4d7d29
  Author: Paul "LeoNerd" Evans <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M class.c

  Log Message:
  -----------
  class.c: Correct allocation of OP_ARGCHECK aux structure

The original code was wrong on two counts:
 * Using Newx() instead of PerlMemShared_malloc()
 * Creating a generic UNOP_AUX_item array instead of the special struct
   type


  Commit: ca6cc1f00ae928faf2e12aed4479fb9ea7eefbb2
      
https://github.com/Perl/perl5/commit/ca6cc1f00ae928faf2e12aed4479fb9ea7eefbb2
  Author: Karl Williamson <[email protected]>
  Date:   2024-02-09 (Fri, 09 Feb 2024)

  Changed paths:
    M t/loc_tools.pl

  Log Message:
  -----------
  t/loc_tools: Don't return duplicates

Make sure the return of find_locales() (and hence any internal subs
that call it) has no repeated locale names.


Compare: https://github.com/Perl/perl5/compare/0a792b642405...ca6cc1f00ae9
