Re: Windows default locale vs initdb

Thomas Munro Tue, 06 Aug 2024 21:14:27 -0700

On Tue, Jul 23, 2024 at 11:19 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Tue, Jul 23, 2024 at 1:44 AM Andrew Dunstan <and...@dunslane.net> wrote:
> > I have an environment I can use for testing. But what exactly am I
> > testing? :-) Install a few "problem" language/region settings, switch
> > the system and ensure initdb runs ok?


I thought a bit more about what to do with the messy .UTF-8 situation
on Windows, and I think I might see a way forward that harmonises the
code and behaviour with Unix, and deletes a lot of special case code.
But it's only theories + CI so far.

0001, 0002:  As before, teach initdb.exe to choose e.g. "en-US" by default.

0003:  Force people to choose locales that match the database
encoding, as we do on Unix.  That is, forbid contradictory
combinations like --locale="English_United States.1252"
--encoding=UTF8, which are currently allowed (and the world is full of
such database clusters because that is how the EDB installer GUI makes
them).  The only allowed combinations for American English should now
be: --locale="en-US" --encoding="WIN1252", and --locale="en-US.UTF-8"
--encoding="UTF8".  You can still use the old names if you like, by
explicitly writing --locale="English_United States.1252", but the
encoding then has to be WIN1252.  It's crazy to mix them up, let's ban
that.
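
For illustration only, the 0003 rule boils down to "the encoding is UTF8 if and only if the locale name says so".  The sketch below is not code from the patch set: the function names are hypothetical, it only looks at the name's suffix, and accepting ".UTF8" as well as ".UTF-8" in any case is an assumption about the spellings Windows takes.  It also ignores the existing exceptions for SQL_ASCII and for locales whose encoding can't be determined.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Case-insensitive ASCII string equality (hypothetical helper). */
bool
ascii_equal_ci(const char *a, const char *b)
{
	while (*a && *b)
	{
		if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
			return false;
		a++;
		b++;
	}
	return *a == '\0' && *b == '\0';
}

/* Does the locale name carry an explicit UTF-8 suffix? */
bool
locale_name_is_utf8(const char *locale)
{
	const char *dot = strrchr(locale, '.');

	if (dot == NULL)
		return false;		/* e.g. "en-US": traditional code page implied */
	return ascii_equal_ci(dot, ".UTF-8") || ascii_equal_ci(dot, ".UTF8");
}

/* Sketch of the rule: UTF8 encoding iff the locale name is a UTF-8 one. */
bool
combination_allowed(const char *locale, bool encoding_is_utf8)
{
	return locale_name_is_utf8(locale) == encoding_is_utf8;
}
```

With this, combination_allowed("en-US.UTF-8", true) and combination_allowed("en-US", false) are accepted, while "English_United States.1252" paired with UTF8 is rejected.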

Obviously there is a pg_upgrade case to worry about there.  We'd have
to "fix" the now illegal combinations, and I don't know exactly how
yet.

0004:  Rip out the code that does extra wchar_t conversions for
collations.  If I've understood correctly, we don't need them: if you
have a .UTF-8 locale then your encoding is UTF-8 and should be able to
use strcoll_l() directly.  Right?
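
A standalone sketch of what "use strcoll_l() directly" means, outside the backend: create a locale object, compare the byte strings as they are, free the object.  The sketch_* names and collate_in_locale() are made up for this example; only newlocale()/strcoll_l()/freelocale() and their underscore-prefixed Windows counterparts are real.  Whether the Windows branch gives sensible results for .UTF-8 locales is exactly the assumption that needs testing on a real system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for newlocale() and strcoll_l() */
#endif

#include <locale.h>
#include <stdbool.h>
#include <string.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

#ifdef _WIN32
/* Windows spells the per-locale functions with a leading underscore. */
typedef _locale_t sketch_locale_t;
#define sketch_newlocale(name)	_create_locale(LC_ALL, (name))
#define sketch_freelocale(loc)	_free_locale(loc)
#define sketch_strcoll_l		_strcoll_l
#else
typedef locale_t sketch_locale_t;
#define sketch_newlocale(name)	newlocale(LC_ALL_MASK, (name), (locale_t) 0)
#define sketch_freelocale(loc)	freelocale(loc)
#define sketch_strcoll_l		strcoll_l
#endif

/*
 * Compare a and b under the named locale with no wchar_t round trip.
 * Returns false if the locale can't be created; otherwise stores the
 * strcoll_l() result in *result and returns true.
 */
bool
collate_in_locale(const char *locale_name, const char *a, const char *b,
				  int *result)
{
	sketch_locale_t loc = sketch_newlocale(locale_name);

	if (loc == (sketch_locale_t) 0)
		return false;
	*result = sketch_strcoll_l(a, b, loc);
	sketch_freelocale(loc);
	return true;
}
```

On Windows the interesting call would be something like collate_in_locale("en-US.UTF-8", s1, s2, &r) with UTF-8 input, compared against what the old wcscoll_l() path produced.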

0005:  Something similar was being done for strftime().  And we might
as well use strftime_l() instead while we're here (part of general
movement to use _l functions and stop splattering setlocale() all over
the place, for the multithreaded future).
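
The same idea for time formatting, again as a hedged standalone sketch rather than the patch itself: weekday_name_in_locale() and the sketch_* names are hypothetical, and the point is only that no setlocale() save/restore is needed when the locale is passed explicitly.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for newlocale() and strftime_l() */
#endif

#include <locale.h>
#include <string.h>
#include <time.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

#ifdef _WIN32
/* Windows spells the per-locale functions with a leading underscore. */
typedef _locale_t sketch_locale_t;
#define sketch_newlocale(name)	_create_locale(LC_ALL, (name))
#define sketch_freelocale(loc)	_free_locale(loc)
#define sketch_strftime_l		_strftime_l
#else
typedef locale_t sketch_locale_t;
#define sketch_newlocale(name)	newlocale(LC_ALL_MASK, (name), (locale_t) 0)
#define sketch_freelocale(loc)	freelocale(loc)
#define sketch_strftime_l		strftime_l
#endif

/*
 * Write the full weekday name for wday (0 = Sunday) in the named locale
 * into dst.  Returns the number of bytes written, or 0 on failure.  The
 * process-wide locale is never touched.
 */
size_t
weekday_name_in_locale(const char *locale_name, int wday,
					   char *dst, size_t dstlen)
{
	struct tm	tm;
	sketch_locale_t loc;
	size_t		len;

	memset(&tm, 0, sizeof(tm));
	tm.tm_mday = 1;				/* keep the date fields valid */
	tm.tm_wday = wday;

	loc = sketch_newlocale(locale_name);
	if (loc == (sketch_locale_t) 0)
		return 0;
	len = sketch_strftime_l(dst, dstlen, "%A", &tm, loc);
	sketch_freelocale(loc);
	return len;
}
```

The output is in the locale's own encoding, so under the 0003 rule it would already match the database encoding for a .UTF-8 locale.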

These patches pass on CI.  Do they give the expected results when used
on a real Windows system?

There are a few more places where we do wchar_t conversions that could
probably be stripped out too, if my assumptions are correct, and we
could dig further if the basic idea can be validated and people think
this is going in a good direction.

From 886815244ab43092562ae3118cd5588a2fad5bb2 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Mon, 20 Nov 2023 14:24:35 +1300
Subject: [PATCH v6 1/5] MinGW has GetLocaleInfoEx().

To use BCP 47 locale names like "en-US" without a suffix ".encoding", we
need to be able to call GetLocaleInfoEx() to look up the encoding.  That
was previously gated for MSVC only, but MinGW has had the function for
many years.  Remove that gating, because otherwise our MinGW build farm
animals would fail when a later commit switches to using the new names by
default.

There are probably other places where _MSC_VER is being used as a proxy
for detecting MinGW with an out-of-date idea about missing functions.

Discussion: https://postgr.es/m/CA%2BhUKGLsV3vTjPp7bOZBr3JTKp3Brkr9V0Qfmc7UvpWcmAQL4A%40mail.gmail.com
---
 src/port/chklocale.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/port/chklocale.c b/src/port/chklocale.c
index 8cb81c8640..a15b0d5349 100644
--- a/src/port/chklocale.c
+++ b/src/port/chklocale.c
@@ -204,7 +204,6 @@ win32_langinfo(const char *ctype)
 	char	   *r = NULL;
 	char	   *codepage;

-#if defined(_MSC_VER)
 	uint32		cp;
 	WCHAR		wctype[LOCALE_NAME_MAX_LENGTH];

@@ -229,7 +228,6 @@ win32_langinfo(const char *ctype)
 		}
 	}
 	else
-#endif
 	{
 		/*
 		 * Locale format on Win32 is <Language>_<Country>.<CodePage>.  For
-- 
2.39.2

From 357751c04cdd3dc7dea1ee9409356d818af70d5d Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 19 Jul 2022 06:31:17 +1200
Subject: [PATCH v6 2/5] Default to IETF BCP 47 locale names in initdb on
 Windows.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Avoid selecting traditional Windows locale names written with English
words, because (1) they are unstable and explicitly not recommended for
use in databases and (2) they may contain non-ASCII characters, which we
can't put in our shared catalogs.  Since setlocale() returns such names,
on Windows use GetUserDefaultLocaleName() if the user didn't provide an
explicit locale.  It returns BCP 47 strings like "en-US".

Also update the documentation to recommend BCP 47 over the traditional
names when providing explicit values to initdb.

Reviewed-by: Juan José Santamaría Flecha <juanjo.santama...@gmail.com>
Reviewed-by:
Discussion: https://postgr.es/m/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
---
 doc/src/sgml/charset.sgml | 13 +++++++++++--
 src/bin/initdb/initdb.c   | 31 +++++++++++++++++++++++++++++--
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 834cb30c85..adb21eb079 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -83,8 +83,17 @@ initdb --locale=sv_SE
     system under what names depends on what was provided by the operating
     system vendor and what was installed.  On most Unix systems, the command
     <literal>locale -a</literal> will provide a list of available locales.
-    Windows uses more verbose locale names, such as <literal>German_Germany</literal>
-    or <literal>Swedish_Sweden.1252</literal>, but the principles are the same.
+   </para>
+
+   <para>
+    Windows uses BCP 47 language tags, like ICU.
+    For example, <literal>sv-SE</literal> represents Swedish as spoken in Sweden.
+    Windows also supports more verbose locale names based on full names
+    such as <literal>German_Germany</literal> or <literal>Swedish_Sweden.1252</literal>,
+    but these are not recommended because they are not stable across operating
+    system updates due to changes in geographical names, and may contain
+    non-ASCII characters which are not supported in PostgreSQL's shared
+    catalogs.
    </para>
 
    <para>
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index f00718a015..393232b6ce 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -64,6 +64,10 @@
 #include "sys/mman.h"
 #endif
 
+#ifdef WIN32
+#include <winnls.h>
+#endif
+
 #include "access/xlog_internal.h"
 #include "catalog/pg_authid_d.h"
 #include "catalog/pg_class_d.h" /* pgrminclude ignore */
@@ -2132,6 +2136,7 @@ locale_date_order(const char *locale)
 static void
 check_locale_name(int category, const char *locale, char **canonname)
 {
+	char	   *locale_copy;
 	char	   *save;
 	char	   *res;
 
@@ -2147,10 +2152,30 @@ check_locale_name(int category, const char *locale, char **canonname)
 
 	/* for setlocale() call */
 	if (!locale)
-		locale = "";
+	{
+#ifdef WIN32
+		wchar_t		wide_name[LOCALE_NAME_MAX_LENGTH];
+		char		name[LOCALE_NAME_MAX_LENGTH];
+
+		/* use Windows API to find the default in BCP47 format */
+		if (GetUserDefaultLocaleName(wide_name, LOCALE_NAME_MAX_LENGTH) == 0)
+			pg_fatal("failed to get default locale name: error code %lu",
+					 GetLastError());
+		if (WideCharToMultiByte(CP_ACP, 0, wide_name, -1, name,
+								LOCALE_NAME_MAX_LENGTH, NULL, NULL) == 0)
+			pg_fatal("failed to convert locale name: error code %lu",
+					 GetLastError());
+		locale_copy = pg_strdup(name);
+#else
+		/* use environment to find the default */
+		locale_copy = pg_strdup("");
+#endif
+	}
+	else
+		locale_copy = pg_strdup(locale);
 
 	/* set the locale with setlocale, to see if it accepts it. */
-	res = setlocale(category, locale);
+	res = setlocale(category, locale_copy);
 
 	/* save canonical name if requested. */
 	if (res && canonname)
@@ -2183,6 +2208,8 @@ check_locale_name(int category, const char *locale, char **canonname)
 			pg_fatal("invalid locale settings; check LANG and LC_* environment variables");
 		}
 	}
+
+	free(locale_copy);
 }
 
 /*
-- 
2.39.2

From a4b0b0324900d12d487370a08b4ddba20552e230 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Wed, 7 Aug 2024 10:23:05 +1200
Subject: [PATCH v6 3/5] Don't allow UTF-8 with non-UTF-8 locales on Windows.

Historically, we allowed contradictions such as:

initdb.exe --locale="French_France.1252" --encoding="UTF-8"

That's because Windows didn't support UTF-8 directly, and PostgreSQL
had to perform UTF-8 char -> wchar_t conversions at various places on
that OS, and still does.  Therefore it was never actually passing
UTF-8 text to operating system facilities.

In preparation for removing those code paths, harmonizing the code
and behavior with Unix builds, and adapting to modern Windows
interfaces, ban such contradictions.  Locale names should ideally be
specified as BCP 47 tags.  If UTF-8 is desired, the name should have
".UTF-8" on the end, but otherwise the traditional encoding of that
language is implied.  Now only the following are valid:

initdb.exe --locale="fr-FR" --encoding="WIN1252"
initdb.exe --locale="fr-FR.UTF-8" --encoding="UTF-8"
initdb.exe --locale="French_France.1252" --encoding="WIN1252"

(The last form is not recommended, but still accepted.)

XXX This will cause problems for clusters upgraded with pg_upgrade from
a system using locales with the wrong encoding.  We'll need a way to
translate to the correct modern locale names.

Discussion: https://postgr.es/m/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
---
 src/backend/commands/dbcommands.c | 12 +-----------
 src/bin/initdb/initdb.c           | 11 -----------
 2 files changed, 1 insertion(+), 22 deletions(-)

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 7026352bc9..566085fecc 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -1555,11 +1555,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
  * 2. locale encoding = -1, which means that we couldn't determine the
  * locale's encoding and have to trust the user to get it right.
  *
- * 3. selected encoding is UTF8 and platform is win32. This is because
- * UTF8 is a pseudo codepage that is supported in all locales since it's
- * converted to UTF16 before being used.
- *
- * 4. selected encoding is SQL_ASCII, but only if you're a superuser. This
+ * 3. selected encoding is SQL_ASCII, but only if you're a superuser. This
  * is risky but we have historically allowed it --- notably, the
  * regression tests require it.
  *
@@ -1574,9 +1570,6 @@ check_encoding_locale_matches(int encoding, const char *collate, const char *cty
 	if (!(ctype_encoding == encoding ||
 		  ctype_encoding == PG_SQL_ASCII ||
 		  ctype_encoding == -1 ||
-#ifdef WIN32
-		  encoding == PG_UTF8 ||
-#endif
 		  (encoding == PG_SQL_ASCII && superuser())))
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1589,9 +1582,6 @@ check_encoding_locale_matches(int encoding, const char *collate, const char *cty
 	if (!(collate_encoding == encoding ||
 		  collate_encoding == PG_SQL_ASCII ||
 		  collate_encoding == -1 ||
-#ifdef WIN32
-		  encoding == PG_UTF8 ||
-#endif
 		  (encoding == PG_SQL_ASCII && superuser())))
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 393232b6ce..77bf815919 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2228,9 +2228,6 @@ check_locale_encoding(const char *locale, int user_enc)
 	if (!(locale_enc == user_enc ||
 		  locale_enc == PG_SQL_ASCII ||
 		  locale_enc == -1 ||
-#ifdef WIN32
-		  user_enc == PG_UTF8 ||
-#endif
 		  user_enc == PG_SQL_ASCII))
 	{
 		pg_log_error("encoding mismatch");
@@ -2695,13 +2692,6 @@ setup_locale_encoding(void)
 			 * Windows, UTF-8 works with any locale, so we can fall back to
 			 * UTF-8.
 			 */
-#ifdef WIN32
-			encodingid = PG_UTF8;
-			printf(_("Encoding \"%s\" implied by locale is not allowed as a server-side encoding.\n"
-					 "The default database encoding will be set to \"%s\" instead.\n"),
-				   pg_encoding_to_char(ctype_enc),
-				   pg_encoding_to_char(encodingid));
-#else
 			pg_log_error("locale \"%s\" requires unsupported encoding \"%s\"",
 						 lc_ctype, pg_encoding_to_char(ctype_enc));
 			pg_log_error_detail("Encoding \"%s\" is not allowed as a server-side encoding.",
@@ -2709,7 +2699,6 @@ setup_locale_encoding(void)
 			pg_log_error_hint("Rerun %s with a different locale selection.",
 							  progname);
 			exit(1);
-#endif
 		}
 		else
 		{
-- 
2.39.2

From 5e8689b50db21fe5adfcee15f54524eefe64c492 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Wed, 7 Aug 2024 10:36:15 +1200
Subject: [PATCH v6 4/5] Collate UTF-8 without wchar_t conversion in Windows.

Traditionally, Windows didn't support UTF-8 encoding in system
interfaces, and we had to convert to UTF-16 and use wcscoll_l().
Windows 10+ has UTF-8 support, and an earlier commit banned the use of
locales with encoding that doesn't match the database, so we can now
harmonize with the Unix code paths and just call strcoll_l().

Discussion: https://postgr.es/m/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
---
 src/backend/utils/adt/pg_locale.c | 90 +------------------------------
 1 file changed, 1 insertion(+), 89 deletions(-)

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index cd3661e727..4d3c3e4e75 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -1804,78 +1804,6 @@ get_collation_actual_version(char collprovider, const char *collcollate)
 	return collversion;
 }
 
-/*
- * pg_strncoll_libc_win32_utf8
- *
- * Win32 does not have UTF-8. Convert UTF8 arguments to wide characters and
- * invoke wcscoll_l().
- */
-#ifdef WIN32
-static int
-pg_strncoll_libc_win32_utf8(const char *arg1, size_t len1, const char *arg2,
-							size_t len2, pg_locale_t locale)
-{
-	char		sbuf[TEXTBUFLEN];
-	char	   *buf = sbuf;
-	char	   *a1p,
-			   *a2p;
-	int			a1len = len1 * 2 + 2;
-	int			a2len = len2 * 2 + 2;
-	int			r;
-	int			result;
-
-	Assert(locale->provider == COLLPROVIDER_LIBC);
-	Assert(GetDatabaseEncoding() == PG_UTF8);
-#ifndef WIN32
-	Assert(false);
-#endif
-
-	if (a1len + a2len > TEXTBUFLEN)
-		buf = palloc(a1len + a2len);
-
-	a1p = buf;
-	a2p = buf + a1len;
-
-	/* API does not work for zero-length input */
-	if (len1 == 0)
-		r = 0;
-	else
-	{
-		r = MultiByteToWideChar(CP_UTF8, 0, arg1, len1,
-								(LPWSTR) a1p, a1len / 2);
-		if (!r)
-			ereport(ERROR,
-					(errmsg("could not convert string to UTF-16: error code %lu",
-							GetLastError())));
-	}
-	((LPWSTR) a1p)[r] = 0;
-
-	if (len2 == 0)
-		r = 0;
-	else
-	{
-		r = MultiByteToWideChar(CP_UTF8, 0, arg2, len2,
-								(LPWSTR) a2p, a2len / 2);
-		if (!r)
-			ereport(ERROR,
-					(errmsg("could not convert string to UTF-16: error code %lu",
-							GetLastError())));
-	}
-	((LPWSTR) a2p)[r] = 0;
-
-	errno = 0;
-	result = wcscoll_l((LPWSTR) a1p, (LPWSTR) a2p, locale->info.lt);
-	if (result == 2147483647)	/* _NLSCMPERROR; missing from mingw headers */
-		ereport(ERROR,
-				(errmsg("could not compare Unicode strings: %m")));
-
-	if (buf != sbuf)
-		pfree(buf);
-
-	return result;
-}
-#endif							/* WIN32 */
-
 /*
  * pg_strcoll_libc
  *
@@ -1891,17 +1819,7 @@ pg_strcoll_libc(const char *arg1, const char *arg2, pg_locale_t locale)
 	int			result;
 
 	Assert(locale->provider == COLLPROVIDER_LIBC);
-#ifdef WIN32
-	if (GetDatabaseEncoding() == PG_UTF8)
-	{
-		size_t		len1 = strlen(arg1);
-		size_t		len2 = strlen(arg2);
-
-		result = pg_strncoll_libc_win32_utf8(arg1, len1, arg2, len2, locale);
-	}
-	else
-#endif							/* WIN32 */
-		result = strcoll_l(arg1, arg2, locale->info.lt);
+	result = strcoll_l(arg1, arg2, locale->info.lt);
 
 	return result;
 }
@@ -1925,12 +1843,6 @@ pg_strncoll_libc(const char *arg1, size_t len1, const char *arg2, size_t len2,
 
 	Assert(locale->provider == COLLPROVIDER_LIBC);
 
-#ifdef WIN32
-	/* check for this case before doing the work for nul-termination */
-	if (GetDatabaseEncoding() == PG_UTF8)
-		return pg_strncoll_libc_win32_utf8(arg1, len1, arg2, len2, locale);
-#endif							/* WIN32 */
-
 	if (bufsize1 + bufsize2 > TEXTBUFLEN)
 		buf = palloc(bufsize1 + bufsize2);
 
-- 
2.39.2

From 05a451df747f219192fbf79d833ac50285048dbf Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Wed, 7 Aug 2024 11:33:22 +1200
Subject: [PATCH v6 5/5] Format times without wchar_t conversion in Windows.

Previously we allowed the locale to be set to something that used an
encoding that didn't match the database.  We have disallowed that now,
so we can use strftime() directly.  And if we're going to touch that
code, we might as well use strftime_l() instead and skip some ugly
save/restore of global state.

strftime_l() is from POSIX 2008.  All supported systems have it, though
on Windows its name has a leading underscore.

For the CI MinGW cross-build warning check to pass, add -lucrt because
otherwise strftime_l() is not available.

Discussion: https://postgr.es/m/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
---
 .cirrus.tasks.yml                 |   1 +
 src/backend/utils/adt/pg_locale.c | 147 ++++--------------------------
 src/include/port/win32_port.h     |   1 +
 3 files changed, 21 insertions(+), 128 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 1ce6c443a8..3bf81ed4af 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -753,6 +753,7 @@ task:
         --host=x86_64-w64-mingw32 \
         --enable-cassert \
         --without-icu \
+        LDFLAGS="-lucrt" \
         CC="ccache x86_64-w64-mingw32-gcc" \
         CXX="ccache x86_64-w64-mingw32-g++"
       make -s -j${BUILD_JOBS} clean
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 4d3c3e4e75..5e64470b58 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -174,6 +174,8 @@ static void icu_set_collation_attributes(UCollator *collator, const char *loc,
 										 UErrorCode *status);
 #endif
 
+static void report_newlocale_failure(const char *localename);
+
 /*
  * POSIX doesn't define _l-variants of these functions, but several systems
  * have them.  We provide our own replacements here.
@@ -732,65 +734,6 @@ PGLC_localeconv(void)
 	return &CurrentLocaleConv;
 }
 
-#ifdef WIN32
-/*
- * On Windows, strftime() returns its output in encoding CP_ACP (the default
- * operating system codepage for the computer), which is likely different
- * from SERVER_ENCODING.  This is especially important in Japanese versions
- * of Windows which will use SJIS encoding, which we don't support as a
- * server encoding.
- *
- * So, instead of using strftime(), use wcsftime() to return the value in
- * wide characters (internally UTF16) and then convert to UTF8, which we
- * know how to handle directly.
- *
- * Note that this only affects the calls to strftime() in this file, which are
- * used to get the locale-aware strings. Other parts of the backend use
- * pg_strftime(), which isn't locale-aware and does not need to be replaced.
- */
-static size_t
-strftime_win32(char *dst, size_t dstlen,
-			   const char *format, const struct tm *tm)
-{
-	size_t		len;
-	wchar_t		wformat[8];		/* formats used below need 3 chars */
-	wchar_t		wbuf[MAX_L10N_DATA];
-
-	/*
-	 * Get a wchar_t version of the format string.  We only actually use
-	 * plain-ASCII formats in this file, so we can say that they're UTF8.
-	 */
-	len = MultiByteToWideChar(CP_UTF8, 0, format, -1,
-							  wformat, lengthof(wformat));
-	if (len == 0)
-		elog(ERROR, "could not convert format string from UTF-8: error code %lu",
-			 GetLastError());
-
-	len = wcsftime(wbuf, MAX_L10N_DATA, wformat, tm);
-	if (len == 0)
-	{
-		/*
-		 * wcsftime failed, possibly because the result would not fit in
-		 * MAX_L10N_DATA.  Return 0 with the contents of dst unspecified.
-		 */
-		return 0;
-	}
-
-	len = WideCharToMultiByte(CP_UTF8, 0, wbuf, len, dst, dstlen - 1,
-							  NULL, NULL);
-	if (len == 0)
-		elog(ERROR, "could not convert string to UTF-8: error code %lu",
-			 GetLastError());
-
-	dst[len] = '\0';
-
-	return len;
-}
-
-/* redefine strftime() */
-#define strftime(a,b,c,d) strftime_win32(a,b,c,d)
-#endif							/* WIN32 */
-
 /*
  * Subroutine for cache_locale_time().
  * Convert the given string from encoding "encoding" to the database
@@ -829,10 +772,7 @@ cache_locale_time(void)
 	bool		strftimefail = false;
 	int			encoding;
 	int			i;
-	char	   *save_lc_time;
-#ifdef WIN32
-	char	   *save_lc_ctype;
-#endif
+	locale_t	locale;
 
 	/* did we do this already? */
 	if (CurrentLCTimeValid)
@@ -840,50 +780,24 @@ cache_locale_time(void)
 
 	elog(DEBUG3, "cache_locale_time() executed; locale: \"%s\"", locale_time);
 
-	/*
-	 * As in PGLC_localeconv(), it's critical that we not throw error while
-	 * libc's locale settings have nondefault values.  Hence, we just call
-	 * strftime() within the critical section, and then convert and save its
-	 * results afterwards.
-	 */
-
-	/* Save prevailing value of time locale */
-	save_lc_time = setlocale(LC_TIME, NULL);
-	if (!save_lc_time)
-		elog(ERROR, "setlocale(NULL) failed");
-	save_lc_time = pstrdup(save_lc_time);
-
 #ifdef WIN32
-
-	/*
-	 * On Windows, it appears that wcsftime() internally uses LC_CTYPE, so we
-	 * must set it here.  This code looks the same as what PGLC_localeconv()
-	 * does, but the underlying reason is different: this does NOT determine
-	 * the encoding we'll get back from strftime_win32().
-	 */
-
-	/* Save prevailing value of ctype locale */
-	save_lc_ctype = setlocale(LC_CTYPE, NULL);
-	if (!save_lc_ctype)
-		elog(ERROR, "setlocale(NULL) failed");
-	save_lc_ctype = pstrdup(save_lc_ctype);
-
-	/* use lc_time to set the ctype */
-	setlocale(LC_CTYPE, locale_time);
+	locale = _create_locale(LC_ALL, locale_time);
+#else
+	locale = newlocale(LC_ALL_MASK, locale_time, NULL);
 #endif
+	if (!locale)
+		report_newlocale_failure(locale_time);
 
-	setlocale(LC_TIME, locale_time);
-
-	/* We use times close to current time as data for strftime(). */
+	/* We use times close to current time as data for strftime_l(). */
 	timenow = time(NULL);
 	timeinfo = localtime(&timenow);
 
-	/* Store the strftime results in MAX_L10N_DATA-sized portions of buf[] */
+	/* Store the strftime_l results in MAX_L10N_DATA-sized portions of buf[] */
 	bufptr = buf;
 
 	/*
 	 * MAX_L10N_DATA is sufficient buffer space for every known locale, and
-	 * POSIX defines no strftime() errors.  (Buffer space exhaustion is not an
+	 * POSIX defines no strftime_l() errors.  (Buffer space exhaustion is not an
 	 * error.)  An implementation might report errors (e.g. ENOMEM) by
 	 * returning 0 (or, less plausibly, a negative value) and setting errno.
 	 * Report errno just in case the implementation did that, but clear it in
@@ -895,10 +809,10 @@ cache_locale_time(void)
 	for (i = 0; i < 7; i++)
 	{
 		timeinfo->tm_wday = i;
-		if (strftime(bufptr, MAX_L10N_DATA, "%a", timeinfo) <= 0)
+		if (strftime_l(bufptr, MAX_L10N_DATA, "%a", timeinfo, locale) <= 0)
 			strftimefail = true;
 		bufptr += MAX_L10N_DATA;
-		if (strftime(bufptr, MAX_L10N_DATA, "%A", timeinfo) <= 0)
+		if (strftime_l(bufptr, MAX_L10N_DATA, "%A", timeinfo, locale) <= 0)
 			strftimefail = true;
 		bufptr += MAX_L10N_DATA;
 	}
@@ -908,39 +822,26 @@ cache_locale_time(void)
 	{
 		timeinfo->tm_mon = i;
 		timeinfo->tm_mday = 1;	/* make sure we don't have invalid date */
-		if (strftime(bufptr, MAX_L10N_DATA, "%b", timeinfo) <= 0)
+		if (strftime_l(bufptr, MAX_L10N_DATA, "%b", timeinfo, locale) <= 0)
 			strftimefail = true;
 		bufptr += MAX_L10N_DATA;
-		if (strftime(bufptr, MAX_L10N_DATA, "%B", timeinfo) <= 0)
+		if (strftime_l(bufptr, MAX_L10N_DATA, "%B", timeinfo, locale) <= 0)
 			strftimefail = true;
 		bufptr += MAX_L10N_DATA;
 	}
 
-	/*
-	 * Restore the prevailing locale settings; as in PGLC_localeconv(),
-	 * failure to do so is fatal.
-	 */
 #ifdef WIN32
-	if (!setlocale(LC_CTYPE, save_lc_ctype))
-		elog(FATAL, "failed to restore LC_CTYPE to \"%s\"", save_lc_ctype);
+	_free_locale(locale);
+#else
+	freelocale(locale);
 #endif
-	if (!setlocale(LC_TIME, save_lc_time))
-		elog(FATAL, "failed to restore LC_TIME to \"%s\"", save_lc_time);
 
 	/*
 	 * At this point we've done our best to clean up, and can throw errors, or
 	 * call functions that might throw errors, with a clean conscience.
 	 */
 	if (strftimefail)
-		elog(ERROR, "strftime() failed: %m");
-
-	/* Release the pstrdup'd locale names */
-	pfree(save_lc_time);
-#ifdef WIN32
-	pfree(save_lc_ctype);
-#endif
-
-#ifndef WIN32
+		elog(ERROR, "strftime_l() failed: %m");
 
 	/*
 	 * As in PGLC_localeconv(), we must convert strftime()'s output from the
@@ -951,16 +852,6 @@ cache_locale_time(void)
 	if (encoding < 0)
 		encoding = PG_SQL_ASCII;
 
-#else
-
-	/*
-	 * On Windows, strftime_win32() always returns UTF8 data, so convert from
-	 * that if necessary.
-	 */
-	encoding = PG_UTF8;
-
-#endif							/* WIN32 */
-
 	bufptr = buf;
 
 	/* localized days */
diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h
index 7ffe5891c6..87157a1095 100644
--- a/src/include/port/win32_port.h
+++ b/src/include/port/win32_port.h
@@ -450,6 +450,7 @@ extern int	_pglstat64(const char *name, struct stat *buf);
 #define isspace_l _isspace_l
 #define iswspace_l _iswspace_l
 #define strcoll_l _strcoll_l
+#define strftime_l _strftime_l
 #define strxfrm_l _strxfrm_l
 #define wcscoll_l _wcscoll_l
 
-- 
2.39.2
