pattern_char_isalpha() doesn't check for the builtin collation
provider (PG_C_UTF8), and ends up falling through to isalpha() for
characters in the ASCII range.

I don't think this is an actual correctness bug, because:

  (a) For all locales I tested on Linux and macOS, isalpha() has
identical behavior for the ASCII range.
  (b) To be an actual correctness bug, it would need to be a false
negative; that is, to say that a character is not case-varying when it
is. The only case-varying characters in the ASCII range for PG_C_UTF8
are [A-Za-z], and it seems unlikely that any locale would treat those
as non-alphabetic.
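
For anyone who wants to repeat the check in (a) on another platform, a
minimal sketch follows. It is not the test that was actually run; the
helper names ascii_isalpha() and count_ascii_mismatches() are made up
for illustration. It switches LC_CTYPE to the named locale and counts
the code points below 128 where isalpha() disagrees with the plain
[A-Za-z] range test.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: the ASCII-only range test used by the non-libc branches. */
static bool
ascii_isalpha(int c)
{
	return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/*
 * Hypothetical helper: number of code points in 0..127 where isalpha()
 * under the given locale disagrees with ascii_isalpha(), or -1 if the
 * locale cannot be set. Note that this changes the process-wide LC_CTYPE.
 */
static int
count_ascii_mismatches(const char *locale_name)
{
	int			mismatches = 0;

	assert(locale_name != NULL);

	if (setlocale(LC_CTYPE, locale_name) == NULL)
		return -1;

	for (int c = 0; c < 128; c++)
	{
		if ((isalpha(c) != 0) != ascii_isalpha(c))
			mismatches++;
	}

	return mismatches;
}
```

Calling count_ascii_mismatches() once per locale name reported by
"locale -a" and looking for any nonzero result would cover the
locales installed on a given machine.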

But I think we should fix this and backport to 17, because there's no
reason we should be calling libc at all when using PG_C_UTF8, and it
might cause an issue on some platform that I didn't test.
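
To make the fall-through concrete, here is a simplified model of the
provider checks only. It is a sketch, not the PostgreSQL code: the
SketchProvider enum and ascii_reaches_libc() are stand-ins invented
for illustration, and it assumes the byte is ASCII and has already
passed the earlier branches of pattern_char_isalpha().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the collation provider codes; not the real definitions. */
typedef enum
{
	SKETCH_BUILTIN,
	SKETCH_ICU,
	SKETCH_LIBC
} SketchProvider;

/*
 * Returns true if an ASCII byte that reaches the provider checks would be
 * classified by a libc call (isalpha() or isalpha_l()).
 */
static bool
ascii_reaches_libc(SketchProvider provider, bool patched)
{
	assert(provider == SKETCH_BUILTIN ||
		   provider == SKETCH_ICU ||
		   provider == SKETCH_LIBC);

	if (patched)
		return provider == SKETCH_LIBC;	/* test is "provider != LIBC" */

	return provider != SKETCH_ICU;		/* test was "provider == ICU" */
}
```

In this model the builtin provider is the only one whose answer
changes: it reaches libc before the patch and does not afterwards.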

Fix attached (slightly different on master and 17). I intend to commit
soon.

Regards,
        Jeff Davis

From c65eb2fda1d7c9a29846d61bdb0358a0e73e2226 Mon Sep 17 00:00:00 2001
From: Jeff Davis <j...@j-davis.com>
Date: Wed, 9 Oct 2024 22:28:15 -0700
Subject: [PATCH v17] Fix missed case for builtin collation provider.

A missed check for the builtin collation provider could result in
falling through to call isalpha().

This does not appear to have practical consequences because it only
happens for characters in the ASCII range. Regardless, the builtin
provider should not be calling libc functions, so backpatch.

Backpatch-through: 17
---
 src/backend/utils/adt/like_support.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/like_support.c b/src/backend/utils/adt/like_support.c
index 2635050861..6cd21ba8fe 100644
--- a/src/backend/utils/adt/like_support.c
+++ b/src/backend/utils/adt/like_support.c
@@ -1505,7 +1505,7 @@ pattern_char_isalpha(char c, bool is_multibyte,
 		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 	else if (is_multibyte && IS_HIGHBIT_SET(c))
 		return true;
-	else if (locale && locale->provider == COLLPROVIDER_ICU)
+	else if (locale && locale->provider != COLLPROVIDER_LIBC)
 		return IS_HIGHBIT_SET(c) ||
 			(c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 	else if (locale && locale->provider == COLLPROVIDER_LIBC)
-- 
2.34.1

From 51c86d422942152c960d02c7478483d3b21f1390 Mon Sep 17 00:00:00 2001
From: Jeff Davis <j...@j-davis.com>
Date: Wed, 9 Oct 2024 22:28:15 -0700
Subject: [PATCH v18] Fix missed case for builtin collation provider.

A missed check for the builtin collation provider could result in
falling through to call isalpha().

This does not appear to have practical consequences because it only
happens for characters in the ASCII range. Regardless, the builtin
provider should not be calling libc functions, so backpatch.

Backpatch-through: 17
---
 src/backend/utils/adt/like_support.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/like_support.c b/src/backend/utils/adt/like_support.c
index 79c4ddc757..8b15509a3b 100644
--- a/src/backend/utils/adt/like_support.c
+++ b/src/backend/utils/adt/like_support.c
@@ -1500,13 +1500,11 @@ pattern_char_isalpha(char c, bool is_multibyte,
 		return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
 	else if (is_multibyte && IS_HIGHBIT_SET(c))
 		return true;
-	else if (locale->provider == COLLPROVIDER_ICU)
+	else if (locale->provider != COLLPROVIDER_LIBC)
 		return IS_HIGHBIT_SET(c) ||
 			(c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
-	else if (locale->provider == COLLPROVIDER_LIBC)
-		return isalpha_l((unsigned char) c, locale->info.lt);
 	else
-		return isalpha((unsigned char) c);
+		return isalpha_l((unsigned char) c, locale->info.lt);
 }
 
 
-- 
2.34.1
