Re: [bug-gettext] RFC: move LANGUAGE check out of gettext()

Bruno Haible Tue, 10 May 2016 12:49:27 -0700

Hi Daiki,

> A while ago Matthias Clasen pointed me to a bug that is caused by a race
> condition between a getenv() call in gettext() and a setenv() call in
> another thread:
> https://bugzilla.gnome.org/show_bug.cgi?id=754951
> 
> The direct cause of this bug is that gettext() tries to check LANGUAGE
> envvar, while the string content returned by getenv() can be overwritten
> by setenv() before being used.


And the deeper cause of this bug is that programs are calling setenv()
in a multi-threaded program, although the Glibc manual
http://www.gnu.org/software/libc/manual/html_node/Environment-Access.html
says:
  "Modifications of environment variables are not allowed in multi-threaded
   programs."
There's a similar rant regarding setenv() in
  http://www.club.cc.cmu.edu/~cmccabe/blog_the_setenv_fiasco.html
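
To make the hazard concrete: the pointer returned by getenv() refers to
storage that a later setenv() is allowed to replace. A minimal sketch of
one defensive pattern follows, assuming it runs while the program is still
single-threaded. copy_env_value() and the DEMO_LANGUAGE variable are
hypothetical names used only for illustration; they are not part of glibc
or gettext.

```c
#define _POSIX_C_SOURCE 200809L   /* for setenv() in the usage example */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a private heap copy of the value of NAME,
   or NULL if NAME is unset or memory is exhausted.  It must be called
   before any other thread can modify the environment; the copy then stays
   valid regardless of later setenv() calls.  The caller frees the result. */
char *
copy_env_value (const char *name)
{
  const char *value = getenv (name);
  if (value == NULL)
    return NULL;
  size_t size = strlen (value) + 1;
  char *copy = malloc (size);
  if (copy != NULL)
    memcpy (copy, value, size);
  return copy;
}

/* Usage example (single-threaded on purpose).  */
void
demo_copy_survives_setenv (void)
{
  setenv ("DEMO_LANGUAGE", "de:fr", 1);
  char *snapshot = copy_env_value ("DEMO_LANGUAGE");
  assert (snapshot != NULL);
  setenv ("DEMO_LANGUAGE", "ja", 1);         /* may invalidate getenv()'s pointer */
  assert (strcmp (snapshot, "de:fr") == 0);  /* the private copy is unaffected */
  free (snapshot);
}
```

This only avoids reading a stale pointer; it does not make setenv() in a
multi-threaded program legitimate.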


Why is this being reported for the LANGUAGE environment variable but not
for the LANG and LC_ALL environment variables? Because for LANG and LC_*
we have an architecture composed of three functionalities:

  (A) environment variables: getenv(), setenv()

  (B) locales: setlocale(), newlocale(), uselocale().

  (C) gettext() and friends.

(A) is the bottom-most layer. But it has the limitation that multi-threaded
programs must not call setenv().

(B) is a layer that fetches its initial values from (A) and that provides
mutators (setlocale(), uselocale()) for use in multi-threaded programs.
setlocale() exists so that multi-threaded applications can modify the
program's locale after startup.
uselocale() exists so that multi-threaded programs can have a locale per
thread.
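
As an illustration of the per-thread mutator in layer (B), here is a small
sketch using the POSIX newlocale()/uselocale()/freelocale() calls. The
function name demo_thread_locale() is made up for this example; the "C"
locale is used only because it is available everywhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>

/* Switches the calling thread to the "C" locale and back again.
   Returns 0 on success, -1 if the locale object could not be created. */
int
demo_thread_locale (void)
{
  locale_t c_locale = newlocale (LC_ALL_MASK, "C", (locale_t) 0);
  if (c_locale == (locale_t) 0)
    return -1;

  /* Install the locale for this thread only.  The return value is the
     locale that was in effect before (LC_GLOBAL_LOCALE if the thread had
     none of its own).  */
  locale_t previous = uselocale (c_locale);
  assert (uselocale ((locale_t) 0) == c_locale);   /* query without changing */

  /* Restore before freeing: a locale object must not be freed while it
     is still installed in a thread.  */
  uselocale (previous);
  assert (uselocale ((locale_t) 0) == previous);

  freelocale (c_locale);
  return 0;
}
```

Other threads keep whatever locale they had; no environment variable is
touched.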

(C) is an application layer that happens to be in Glibc for convenience
reasons. It is based on the layer (B).


Back to the LANGUAGE environment variable. The problem is that here we
have the layers (A) and (C), but (B) is missing. The solution ought to
be to introduce a layer (B) for LANGUAGE. LANGUAGE is not specified by
POSIX and does not perfectly fit into the locale system, therefore I
believe it is best treated separately.

So, what I imagine is a layer (B) with an API like this:

  /* Returns the language precedence list for the program. */
  const char *get_i18n_language (void);

  /* Sets the language precedence list for the program.
     NULL means to use the one inferred from the environment variable. */
  void set_i18n_language (const char *);

or, if you want to have a language per thread:

  /* Returns the language precedence list for the current thread. */
  const char *get_i18n_language (void);

  /* Sets the language precedence list for the program.
     NULL means to use the one inferred from the environment variable. */
  void set_i18n_language (const char *);

  /* Sets the language precedence list for the current thread.
     NULL means to use the one for the program or, if not set,
     the one inferred from the environment variable. */
  void set_thread_i18n_language (const char *);

You can protect the implementation of these functions with locks
(functions/macros gl_rwlock_*).
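
A rough sketch of how the three-function variant could be implemented,
under several assumptions of mine: a plain pthread mutex stands in for
gl_rwlock_* (a reader/writer lock would let concurrent readers proceed),
C11 _Thread_local holds the per-thread value, and strings are interned and
never freed so that a pointer returned by the getter stays valid after a
later set call. The helper intern_locked() and the NULL return for "nothing
set and LANGUAGE unset" are illustrative choices, not part of the proposal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Interned strings are never freed, so returned pointers stay valid. */
struct interned { struct interned *next; char text[]; };

static pthread_mutex_t language_lock = PTHREAD_MUTEX_INITIALIZER;
static struct interned *pool;             /* protected by language_lock */
static const char *program_language;      /* protected by language_lock */
static _Thread_local const char *thread_language;   /* NULL = not set */

/* Returns the pooled copy of S, adding it if needed.
   The caller must hold language_lock.  */
static const char *
intern_locked (const char *s)
{
  assert (s != NULL);
  for (struct interned *p = pool; p != NULL; p = p->next)
    if (strcmp (p->text, s) == 0)
      return p->text;
  struct interned *p = malloc (sizeof *p + strlen (s) + 1);
  if (p == NULL)
    abort ();   /* sketch only: real code would report the failure */
  strcpy (p->text, s);
  p->next = pool;
  pool = p;
  return p->text;
}

void
set_i18n_language (const char *language)
{
  pthread_mutex_lock (&language_lock);
  program_language = (language != NULL ? intern_locked (language) : NULL);
  pthread_mutex_unlock (&language_lock);
}

void
set_thread_i18n_language (const char *language)
{
  const char *value = NULL;
  if (language != NULL)
    {
      pthread_mutex_lock (&language_lock);
      value = intern_locked (language);
      pthread_mutex_unlock (&language_lock);
    }
  thread_language = value;   /* thread-local, so no lock is needed */
}

const char *
get_i18n_language (void)
{
  if (thread_language != NULL)
    return thread_language;
  pthread_mutex_lock (&language_lock);
  const char *value = program_language;
  if (value == NULL)
    {
      /* Fall back to the environment.  This is safe only under the rule
         that nobody calls setenv() once threads exist.  */
      const char *env = getenv ("LANGUAGE");
      if (env != NULL)
        value = intern_locked (env);
    }
  pthread_mutex_unlock (&language_lock);
  return value;
}
```

A caller would use set_thread_i18n_language ("sv") where it might
otherwise have been tempted to call setenv ("LANGUAGE", "sv", 1). The
never-freed pool grows only with the number of distinct values ever set.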


With this approach,
  - Multi-threaded programs can change the i18n language in a thread-safe
    way, without using setenv().
  - The setlocale() code is left alone.

> To mitigate this, I was wondering if it would be possible to move the
> getenv() call from gettext() to setlocale(), which is typically called
> at program startup, and cache the value in a static variable.
> 
> The attached patch is an experiment implementing that in libintl, though
> in practice it would require a change in glibc's setlocale
> implementation.

I see two drawbacks of this patch:

  * It does not solve the root of the problem, namely the violation of the
    rule "multi-threaded programs should not call setenv".

  * It modifies the code of setlocale() for a purpose that is unrelated to
    the locale system.

    The fact that glibc's locale/setlocale.c has to increment _nl_msg_cat_cntr
    (notification from layer (B) to layer (C)) is already bad enough; it
    exists because there is no standardized API for being notified of
    locale changes. It forces us to override setlocale on non-glibc systems,
    using gnulibology patterns.

    But adding yet another call from layer (B) to layer (C) is even more of a
    hack.

Bruno
