Replying to Peter and Jeff in one email.

On Sat, Nov 12, 2022 at 3:57 AM Peter Eisentraut
<peter.eisentr...@enterprisedb.com> wrote:
> On 22.10.22 03:22, Thomas Munro wrote:
> > I'd love to hear others' thoughts on how we can turn this into a
> > workable solution. Hopefully while staying simple...
>
> I played with this patch a bit. It looks like a reasonable approach.

Great news.

> Attached is a small patch to get the dynamic libicu* lookup working with
> the library naming on macOS.

Thanks, squashed.

> Instead of packing the ICU version into the locale field ('63:en'), I
> would make it a separate field in pg_collation and a separate argument
> in CREATE COLLATION.

I haven't tried this yet, as I focused on coming up with a way of
testing in this iteration. I can try this next. I'm imagining that
we'd have pg_collation.collicuversion and pg_database.daticuversion,
and they'd default to 0 for "use the GUC", and perhaps you'd even be
able to ALTER them. Perhaps we wouldn't even need the GUC then... 0
could mean "the linked version", and if you don't like it, you ALTER
it. Thinking about this.

> At this point, perhaps it would be good to start building some tests to
> demonstrate various upgrade scenarios and to ensure portability.

OK, here's what I came up with. You enable it in PG_TEST_EXTRA, and
tell it about an alternative ICU version you have in the standard
library search path that is not the same as the main/linked one:

$ meson configure -DPG_TEST_EXTRA="icu=63"
$ meson test icu/020_multiversion

Another change from your feedback: you mentioned that RHEL7 shipped
with ICU 50, so I removed my suggestion of dropping some extra code we
carry for versions before 54 and set the minimum acceptable version to
50. It probably works further back than that, but that's a decent
range, I think.

On Tue, Nov 15, 2022 at 1:55 PM Jeff Davis <pg...@j-davis.com> wrote:
> I looked at v6.

Thanks for jumping in and testing!

> * We'll need some clearer instructions on how to build/install extra
> ICU versions that might not be provided by the distribution packaging.
> For instance, I got a cryptic error until I used --enable-rpath, which
> might not be obvious to all users.

Suggestions welcome. No docs at all yet...

> * Can we have a better error when the library was built with
> --disable-renaming?
> We can just search for the plain (no suffix) symbol.

I threw out that symbol probing logic, and wrote something simpler
that should now also work with --disable-renaming (though not tested).
Now it does a cross-check with the library's self-reported major
version, just to make sure there wasn't a badly named library file,
which may be more likely with --disable-renaming.

> * We should use dlerror() instead of %m to report dlopen() errors.

Fixed.

> * It seems like the collation version is just there to issue WARNINGs
> when a user is using the non-versioned locale syntax and the library
> changes underneath them (or if there is collation version change within
> a single ICU major version)?

Correct. I have now updated the warning messages you get when they
don't match, to provide a hint about what to do about it. I am sure
they need some more word-smithing, though.

> * How are you testing this?

Ad hoc noodling before now, but see attached.

> I realize your patch is experimental, but when there is a better
> consensus on the approach, we should consider adding declarative syntax
> such as:
>
>   CREATE COLLATION (or LOCALE?) PROVIDER icu67
>     TYPE icu VERSION '67' AS '/path/to/icui18n.so.67';
>
> It will offer more opportunities to catch errors early and offer better
> error messages. It would also enable it to function if the library is
> built with --disable-renaming (though we'd have to trust the user).

Earlier in this and other threads, we wondered if each ICU major
version should be a separate provider, which is what you're showing
there, or should be an independent property of an individual
COLLATION, which is what v6 did with '63:en' and what Peter suggested
I make more formal with CREATE COLLATION foo (..., ICU_VERSION=63). I
actually started out thinking we'd have multiple providers, but I
couldn't really think of any advantage, and I think it makes some
upgrade scenarios more painful. Can you elaborate on why you'd want
that model?
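
For anyone skimming, the '63:en' convention mentioned above boils down
to splitting an optional numeric prefix off the locale string. Here is
a minimal standalone sketch of that idea, not the code from the patch:
the function name split_icu_locale, the 0-means-default return
convention and the 999 upper bound are all assumptions made purely for
illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split a value such as "63:en" into an ICU major version (63) and the
 * locale that would be handed to the library ("en").
 *
 * Hypothetical helper for illustration only.  A value without a ':'
 * prefix yields *major = 0, standing for "use the default library".
 * Returns 0 on success and -1 if the prefix is not a plain positive
 * number.
 */
static int
split_icu_locale(const char *value, int *major, const char **locale)
{
	const char *sep;
	char	   *end;
	long		v;

	assert(value != NULL);

	sep = strchr(value, ':');
	if (sep == NULL)
	{
		/* Traditional value with no version prefix. */
		*major = 0;
		*locale = value;
		return 0;
	}

	/* The digits must run right up to the separator. */
	v = strtol(value, &end, 10);
	if (end == value || end != sep || v <= 0 || v > 999)
		return -1;

	*major = (int) v;
	*locale = sep + 1;
	return 0;
}
```

With this sketch, "63:en" gives major 63 and locale "en", plain "en"
gives major 0, and "x:en" is rejected. A separate catalog column, as
Peter suggested, would make this parsing step unnecessary.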

> On Sat, 2022-10-22 at 14:22 +1300, Thomas Munro wrote:
> > Problem 1: Suppose you're ready to start using (say) v72. I guess
> > you'd use the REFRESH command, which would open the main linked ICU's
> > collversion and stamp that into the catalogue, at which point new
> > sessions would start using that, and then you'd have to rebuild all
> > your indexes (with no help from PG to tell you how to find everything
> > that needs to be rebuilt, as belaboured in previous reverted work).
> > Aside from the possibility of getting the rebuilding job wrong (as
> > belaboured elsewhere), it's not great, because there is still a
> > transitional period where you can be using the wrong version for your
> > data. So this requires some careful planning and understanding from
> > the administrator.
>
> How is this related to the search-by-collversion design? It seems like
> it's hard no matter what.

Yeah. I just don't like the way it *appears* to be doing something
clever, but it doesn't solve any fundamental problem at all because
the collversion information is under human control and so it's really
doing something stupid. Hence desire to build something that at least
admits that it's primitive and just gives you some controls, in a
first version. We could always reconsider that in later work though,
maybe even an optional policy or something?
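
Coming back to the --disable-renaming answer further up: the lookup
amounts to trying the major-version-suffixed symbol name first, then
the plain one, and afterwards checking that the library reports the
major version that was asked for. A rough sketch of the two
string-level pieces follows. The helper names versioned_symbol_name
and major_version_matches are made up for this illustration and are
not the names used in the patch; the actual dlsym() calls are left out
so that the fragment stays self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build the symbol name exported by an ICU library built with renaming,
 * e.g. "ucol_open" and 63 give "ucol_open_63".  A caller would pass this
 * name to dlsym() first and fall back to the plain name if that fails.
 *
 * Hypothetical helper.  Returns 0 on success, -1 if buf is too small.
 */
static int
versioned_symbol_name(char *buf, size_t bufsize,
					  const char *function, int major)
{
	int			n;

	assert(buf != NULL && function != NULL);

	n = snprintf(buf, bufsize, "%s_%d", function, major);
	return (n < 0 || (size_t) n >= bufsize) ? -1 : 0;
}

/*
 * Cross-check a self-reported version string such as "63.1" against the
 * major version that was requested.  atoi() stops at the first '.', so
 * only the major part takes part in the comparison.
 *
 * Hypothetical helper.  Returns 1 on match, 0 otherwise.
 */
static int
major_version_matches(const char *reported, int expected_major)
{
	assert(reported != NULL);

	return atoi(reported) == expected_major;
}
```

The cross-check is what catches a badly named library file: a file
named for version 63 that reports "67.1" fails the comparison.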

From 51f0e2eaaf8e941033ad4ba7e412fc900636962d Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Wed, 8 Jun 2022 17:43:53 +1200
Subject: [PATCH v7] WIP: Multi-version ICU.

Add a layer of indirection when accessing ICU, so that multiple major
versions of the library can be used at once. Versions other than the
one that PostgreSQL was linked against are opened with dlopen(), but we
refuse to open versions higher than the one we were compiled against.
The ABI might change in future releases so that wouldn't be safe.

By default, the system linker's default search path is used to find
libraries, but icu_library_path may be used to specify an absolute path
to look in. ICU libraries are expected to have been built without ICU's
--disable-renaming option. That is, major versions must use distinct
symbol names.

This arrangement means that at least one major version of ICU is always
available -- the one that PostgreSQL was linked against. It should be
simple on most software distributions to install extra versions using a
package manager, or to build extra libraries as required, to access
older ICU releases. For example, on Debian bullseye the packages are
named libicu63, libicu67, libicu71.

In this version of the patch, '63:en' used as a database default locale
or COLLATION object requests ICU library 63, and 'en' requests the
library version selected by the GUC default_icu_library_version,
defaulting to the version that the executable is linked against.

XXX Many other designs possible, to discuss!

Reviewed-by: Peter Eisentraut <peter.eisentr...@enterprisedb.com> Reviewed-by: Jeff Davis <pg...@j-davis.com> Discussion: https://postgr.es/m/CA%2BhUKGL4VZRpP3CkjYQkv4RQ6pRYkPkSNgKSxFBwciECQ0mEuQ%40mail.gmail.com --- src/backend/access/hash/hashfunc.c | 16 +- src/backend/commands/collationcmds.c | 20 + src/backend/utils/adt/formatting.c | 53 +- src/backend/utils/adt/pg_locale.c | 451 +++++++++++++++++- src/backend/utils/adt/varchar.c | 16 +- src/backend/utils/adt/varlena.c | 56 +-- src/backend/utils/init/postinit.c | 48 +- src/backend/utils/misc/guc_tables.c | 28 ++ src/backend/utils/misc/postgresql.conf.sample | 5 + src/include/catalog/pg_proc.dat | 3 + src/include/utils/pg_locale.h | 75 +++ src/test/icu/meson.build | 1 + src/test/icu/t/020_multiversion.pl | 203 ++++++++ src/tools/pgindent/typedefs.list | 3 + 14 files changed, 888 insertions(+), 90 deletions(-) create mode 100644 src/test/icu/t/020_multiversion.pl diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c index b57ed946c4..0a61538efd 100644 --- a/src/backend/access/hash/hashfunc.c +++ b/src/backend/access/hash/hashfunc.c @@ -298,11 +298,11 @@ hashtext(PG_FUNCTION_ARGS) ulen = icu_to_uchar(&uchar, VARDATA_ANY(key), VARSIZE_ANY_EXHDR(key)); - bsize = ucol_getSortKey(mylocale->info.icu.ucol, - uchar, ulen, NULL, 0); + bsize = PG_ICU_LIB(mylocale)->getSortKey(PG_ICU_COL(mylocale), + uchar, ulen, NULL, 0); buf = palloc(bsize); - ucol_getSortKey(mylocale->info.icu.ucol, - uchar, ulen, buf, bsize); + PG_ICU_LIB(mylocale)->getSortKey(PG_ICU_COL(mylocale), + uchar, ulen, buf, bsize); result = hash_any(buf, bsize); @@ -355,11 +355,11 @@ hashtextextended(PG_FUNCTION_ARGS) ulen = icu_to_uchar(&uchar, VARDATA_ANY(key), VARSIZE_ANY_EXHDR(key)); - bsize = ucol_getSortKey(mylocale->info.icu.ucol, - uchar, ulen, NULL, 0); + bsize = PG_ICU_LIB(mylocale)->getSortKey(PG_ICU_COL(mylocale), + uchar, ulen, NULL, 0); buf = palloc(bsize); - ucol_getSortKey(mylocale->info.icu.ucol, - uchar, ulen, 
buf, bsize); + PG_ICU_LIB(mylocale)->getSortKey(PG_ICU_COL(mylocale), + uchar, ulen, buf, bsize); result = hash_any_extended(buf, bsize, PG_GETARG_INT64(1)); diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c index 81e54e0ce6..4fb0c77f38 100644 --- a/src/backend/commands/collationcmds.c +++ b/src/backend/commands/collationcmds.c @@ -853,6 +853,26 @@ pg_import_system_collations(PG_FUNCTION_ARGS) CreateComments(collid, CollationRelationId, 0, icucomment); } + + /* Also create an object pinned to an ICU major version. */ + collid = CollationCreate(psprintf("%s-x-icu-%d", langtag, U_ICU_VERSION_MAJOR_NUM), + nspid, GetUserId(), + COLLPROVIDER_ICU, true, -1, + NULL, NULL, + psprintf("%d:%s", U_ICU_VERSION_MAJOR_NUM, iculocstr), + get_collation_actual_version(COLLPROVIDER_ICU, iculocstr), + true, true); + if (OidIsValid(collid)) + { + ncreated++; + + CommandCounterIncrement(); + + icucomment = get_icu_locale_comment(name); + if (icucomment) + CreateComments(collid, CollationRelationId, 0, + icucomment); + } } } #endif /* USE_ICU */ diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c index 26f498b5df..0c3c7724d7 100644 --- a/src/backend/utils/adt/formatting.c +++ b/src/backend/utils/adt/formatting.c @@ -1599,6 +1599,11 @@ typedef int32_t (*ICU_Convert_Func) (UChar *dest, int32_t destCapacity, const UChar *src, int32_t srcLength, const char *locale, UErrorCode *pErrorCode); +typedef int32_t (*ICU_Convert_BI_Func) (UChar *dest, int32_t destCapacity, + const UChar *src, int32_t srcLength, + UBreakIterator *bi, + const char *locale, + UErrorCode *pErrorCode); static int32_t icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale, @@ -1623,18 +1628,41 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t mylocale, } if (U_FAILURE(status)) ereport(ERROR, - (errmsg("case conversion failed: %s", u_errorName(status)))); + (errmsg("case conversion failed: %s", + PG_ICU_LIB(mylocale)->errorName(status)))); 
return len_dest; } +/* + * Like icu_convert_case, but func takes a break iterator (which we don't + * make use of). + */ static int32_t -u_strToTitle_default_BI(UChar *dest, int32_t destCapacity, - const UChar *src, int32_t srcLength, - const char *locale, - UErrorCode *pErrorCode) +icu_convert_case_bi(ICU_Convert_BI_Func func, pg_locale_t mylocale, + UChar **buff_dest, UChar *buff_source, int32_t len_source) { - return u_strToTitle(dest, destCapacity, src, srcLength, - NULL, locale, pErrorCode); + UErrorCode status; + int32_t len_dest; + + len_dest = len_source; /* try first with same length */ + *buff_dest = palloc(len_dest * sizeof(**buff_dest)); + status = U_ZERO_ERROR; + len_dest = func(*buff_dest, len_dest, buff_source, len_source, NULL, + mylocale->info.icu.locale, &status); + if (status == U_BUFFER_OVERFLOW_ERROR) + { + /* try again with adjusted length */ + pfree(*buff_dest); + *buff_dest = palloc(len_dest * sizeof(**buff_dest)); + status = U_ZERO_ERROR; + len_dest = func(*buff_dest, len_dest, buff_source, len_source, NULL, + mylocale->info.icu.locale, &status); + } + if (U_FAILURE(status)) + ereport(ERROR, + (errmsg("case conversion failed: %s", + PG_ICU_LIB(mylocale)->errorName(status)))); + return len_dest; } #endif /* USE_ICU */ @@ -1702,7 +1730,8 @@ str_tolower(const char *buff, size_t nbytes, Oid collid) UChar *buff_conv; len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes); - len_conv = icu_convert_case(u_strToLower, mylocale, + len_conv = icu_convert_case(PG_ICU_LIB(mylocale)->strToLower, + mylocale, &buff_conv, buff_uchar, len_uchar); icu_from_uchar(&result, buff_conv, len_conv); pfree(buff_uchar); @@ -1824,7 +1853,8 @@ str_toupper(const char *buff, size_t nbytes, Oid collid) UChar *buff_conv; len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes); - len_conv = icu_convert_case(u_strToUpper, mylocale, + len_conv = icu_convert_case(PG_ICU_LIB(mylocale)->strToUpper, + mylocale, &buff_conv, buff_uchar, len_uchar); icu_from_uchar(&result, buff_conv, 
len_conv); pfree(buff_uchar); @@ -1947,8 +1977,9 @@ str_initcap(const char *buff, size_t nbytes, Oid collid) UChar *buff_conv; len_uchar = icu_to_uchar(&buff_uchar, buff, nbytes); - len_conv = icu_convert_case(u_strToTitle_default_BI, mylocale, - &buff_conv, buff_uchar, len_uchar); + len_conv = icu_convert_case_bi(PG_ICU_LIB(mylocale)->strToTitle, + mylocale, + &buff_conv, buff_uchar, len_uchar); icu_from_uchar(&result, buff_conv, len_conv); pfree(buff_uchar); pfree(buff_conv); diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c index 2b42d9ccd8..3cc51a54a8 100644 --- a/src/backend/utils/adt/pg_locale.c +++ b/src/backend/utils/adt/pg_locale.c @@ -58,6 +58,7 @@ #include "catalog/pg_collation.h" #include "catalog/pg_control.h" #include "mb/pg_wchar.h" +#include "miscadmin.h" #include "utils/builtins.h" #include "utils/formatting.h" #include "utils/guc_hooks.h" @@ -69,6 +70,7 @@ #ifdef USE_ICU #include <unicode/ucnv.h> +#include <unicode/ustring.h> #endif #ifdef __GLIBC__ @@ -79,14 +81,35 @@ #include <shlwapi.h> #endif +#include <dlfcn.h> + #define MAX_L10N_DATA 80 +#ifdef USE_ICU + +/* + * We don't want to call into dlopen'd ICU libraries that are newer than the + * one we were compiled and linked against, just in case there is an + * incompatible API change. + */ +#define PG_MAX_ICU_MAJOR_VERSION U_ICU_VERSION_MAJOR_NUM + +/* + * The oldest ICU release we're likely to encounter, and that has all the + * funcitons required. + */ +#define PG_MIN_ICU_MAJOR_VERSION 50 + +#endif + /* GUC settings */ char *locale_messages; char *locale_monetary; char *locale_numeric; char *locale_time; +char *icu_library_path; +int default_icu_library_version; /* * lc_time localization cache. @@ -1398,29 +1421,348 @@ lc_ctype_is_c(Oid collation) return (lookup_collation_cache(collation, true))->ctype_is_c; } +#ifdef USE_ICU + struct pg_locale_struct default_locale; +/* Linked list of ICU libraries we have loaded. 
*/ +static pg_icu_library *icu_library_list = NULL; + +/* + * Free an ICU library. pg_icu_library objects that are successfully + * constructed stick around for the lifetime of the backend, but this is used + * to clean up if initialization fails. + */ +static void +free_icu_library(pg_icu_library *lib) +{ + if (lib->libicui18n_handle) + dlclose(lib->libicui18n_handle); + if (lib->libicuuc_handle) + dlclose(lib->libicuuc_handle); + pfree(lib); +} + +static void * +get_icu_function(void *handle, const char *function, int version) +{ + char function_with_version[80]; + void *result; + + /* + * Try to look it up using the symbols with major versions, but if that + * doesn't work, also try the unversioned name in case the library was + * configured with --disable-renaming. + */ + snprintf(function_with_version, sizeof(function_with_version), "%s_%d", + function, version); + result = dlsym(handle, function_with_version); + + return result ? result : dlsym(handle, function); +} + +/* + * Helper to load a library. + */ +static void * +load_icu_library(pg_icu_library *lib, const char *name) +{ + void *handle; + + handle = dlopen(name, RTLD_NOW | RTLD_GLOBAL); + if (handle == NULL) + { + char message[80]; + + strlcpy(message, dlerror(), sizeof(message)); + free_icu_library(lib); + ereport(ERROR, + (errmsg("could not load library \"%s\": %s", name, message))); + } + + return handle; +} + +/* + * Given an ICU major version number, return the object we need to access it, + * or fail while trying to load it. + */ +static pg_icu_library * +get_icu_library(int major_version) +{ + UVersionInfo versioninfo; + char versioninfostring[U_MAX_VERSION_STRING_LENGTH]; + pg_icu_library *lib; + + /* XXX Move range check into guc_table.c? 
*/ + if (major_version < PG_MIN_ICU_MAJOR_VERSION || + major_version > PG_MAX_ICU_MAJOR_VERSION) + elog(ERROR, + "ICU version must be between %d and %d", + PG_MIN_ICU_MAJOR_VERSION, + PG_MAX_ICU_MAJOR_VERSION); + + /* Try to find it in our list of existing libraries. */ + for (lib = icu_library_list; lib; lib = lib->next) + if (lib->major_version == major_version) + return lib; + + /* Make a new entry. */ + lib = MemoryContextAllocZero(TopMemoryContext, sizeof(*lib)); + if (major_version == U_ICU_VERSION_MAJOR_NUM) + { + /* + * This is the version we were compiled and linked against. Simply + * assign the function pointers. + * + * These assignments will fail to compile if an incompatible API + * change is made to some future version of ICU, at which point we + * might need to consider special treatment for different major + * version ranges, with intermediate trampoline functions. + */ + lib->major_version = major_version; + lib->getLibraryVersion = u_getVersion; + lib->open = ucol_open; + lib->close = ucol_close; + lib->getVersion = ucol_getVersion; + lib->versionToString = u_versionToString; + lib->strcoll = ucol_strcoll; + lib->strcollUTF8 = ucol_strcollUTF8; + lib->getSortKey = ucol_getSortKey; + lib->nextSortKeyPart = ucol_nextSortKeyPart; + lib->setUTF8 = uiter_setUTF8; + lib->errorName = u_errorName; + lib->strToUpper = u_strToUpper; + lib->strToLower = u_strToLower; + lib->strToTitle = u_strToTitle; + + /* + * Also assert the size of a couple of types used as output buffers, + * as a canary to tell us to add extra padding in the (unlikely) event + * that a later release makes these values smaller. + */ + StaticAssertStmt(U_MAX_VERSION_STRING_LENGTH == 20, + "u_versionToString output buffer size changed incompatibly"); + StaticAssertStmt(U_MAX_VERSION_LENGTH == 4, + "ucol_getVersion output buffer size changed incompatibly"); + } + else + { + /* This is an older version, so we'll need to use dlopen(). 
*/ + char libicui18n_name[MAXPGPATH]; + char libicuuc_name[MAXPGPATH]; + + /* + * We don't like to open versions newer than what we're linked + * against, to reduce the risk of an API change biting us. + */ + if (major_version > U_ICU_VERSION_MAJOR_NUM) + elog(ERROR, "ICU major version %d higher than linked version %d, refusing to open", + major_version, U_ICU_VERSION_MAJOR_NUM); + + lib->major_version = major_version; + + /* + * See + * https://unicode-org.github.io/icu/userguide/icu4c/packaging.html#icu-versions + * for conventions on library naming on POSIX and Windows systems. + */ + + /* Load the collation library. */ + snprintf(libicui18n_name, + sizeof(libicui18n_name), +#ifdef WIN32 + "%s%sicui18n%d." DLSUFFIX, + icu_library_path, + icu_library_path[0] ? "\\" : "", +#elif defined(__darwin__) + "%s%slibicui18n.%d" DLSUFFIX, + icu_library_path, + icu_library_path[0] ? "/" : "", +#else + "%s%slibicui18n" DLSUFFIX ".%d", + icu_library_path, + icu_library_path[0] ? "/" : "", +#endif + major_version); + lib->libicui18n_handle = load_icu_library(lib, libicui18n_name); + + /* Load the ctype library. */ + snprintf(libicuuc_name, + sizeof(libicuuc_name), +#ifdef WIN32 + "%s%sicuuc%d." DLSUFFIX, + icu_library_path, + icu_library_path[0] ? "\\" : "", +#elif defined(__darwin__) + "%s%slibicuuc.%d" DLSUFFIX, + icu_library_path, + icu_library_path[0] ? "/" : "", +#else + "%s%slibicuuc" DLSUFFIX ".%d", + icu_library_path, + icu_library_path[0] ? "/" : "", +#endif + major_version); + lib->libicuuc_handle = load_icu_library(lib, libicuuc_name); + + /* Look up all the functions we need. 
*/ + lib->getLibraryVersion = get_icu_function(lib->libicui18n_handle, + "u_getVersion", + major_version); + lib->open = get_icu_function(lib->libicui18n_handle, + "ucol_open", + major_version); + lib->close = get_icu_function(lib->libicui18n_handle, + "ucol_close", + major_version); + lib->getVersion = get_icu_function(lib->libicui18n_handle, + "ucol_getVersion", + major_version); + lib->versionToString = get_icu_function(lib->libicui18n_handle, + "u_versionToString", + major_version); + lib->strcoll = get_icu_function(lib->libicui18n_handle, + "ucol_strcoll", + major_version); + lib->strcollUTF8 = get_icu_function(lib->libicui18n_handle, + "ucol_strcollUTF8", + major_version); + lib->getSortKey = get_icu_function(lib->libicui18n_handle, + "ucol_getSortKey", + major_version); + lib->nextSortKeyPart = get_icu_function(lib->libicui18n_handle, + "ucol_nextSortKeyPart", + major_version); + lib->setUTF8 = get_icu_function(lib->libicui18n_handle, + "uiter_setUTF8", + major_version); + lib->errorName = get_icu_function(lib->libicui18n_handle, + "u_errorName", + major_version); + lib->strToUpper = get_icu_function(lib->libicuuc_handle, + "u_strToUpper", + major_version); + lib->strToLower = get_icu_function(lib->libicuuc_handle, + "u_strToLower", + major_version); + lib->strToTitle = get_icu_function(lib->libicuuc_handle, + "u_strToTitle", + major_version); + if (!lib->getLibraryVersion || + !lib->open || + !lib->close || + !lib->getVersion || + !lib->versionToString || + !lib->strcoll || + !lib->strcollUTF8 || + !lib->getSortKey || + !lib->nextSortKeyPart || + !lib->setUTF8 || + !lib->errorName || + !lib->strToUpper || + !lib->strToLower || + !lib->strToTitle) + { + free_icu_library(lib); + ereport(ERROR, + (errmsg("could not find expected symbols in libraries \"%s\" and \"%s\"", + libicui18n_name, libicuuc_name))); + } + } + + /* + * Check that the library's own u_getVersion() function reports the version + * that we expected. 
By using atoi() we take only the major part. + */ + lib->getLibraryVersion(versioninfo); + lib->versionToString(versioninfo, versioninfostring); + if (atoi(versioninfostring) != major_version) + { + free_icu_library(lib); + ereport(ERROR, + (errmsg("opened ICU library with major version %d but it reported its own version as %s", + major_version, versioninfostring))); + } + + lib->next = icu_library_list; + icu_library_list = lib; + + return lib; +} + +/* + * Look up the library to use for a given collcollate string. + */ +static pg_icu_library * +get_icu_library_for_collation(const char *collcollate, const char **rest) +{ + int major_version; + char *separator; + char *after_prefix; + + separator = strchr(collcollate, ':'); + + /* + * If it's a traditional value without a prefix, use the default ICU + * library. That's the one we were linked against, or another one if + * default_icu_library_version has been set. + */ + if (separator == NULL) + { + *rest = collcollate; + + if (default_icu_library_version > 0) + major_version = default_icu_library_version; + else + major_version = U_ICU_VERSION_MAJOR_NUM; + return get_icu_library(major_version); + } + + /* If it has a prefix, interpret it as an ICU major version. */ + major_version = strtol(collcollate, &after_prefix, 10); + if (after_prefix != separator) + elog(ERROR, + "could not parse ICU major library version: \"%s\"", + collcollate); + if (major_version < PG_MIN_ICU_MAJOR_VERSION || + major_version > PG_MAX_ICU_MAJOR_VERSION) + elog(ERROR, + "ICU major library verision out of supported range: \"%s\"", + collcollate); + + /* The part after the separate will be passed to the library. 
*/ + *rest = separator + 1; + + return get_icu_library(major_version); +} + +#endif + void make_icu_collator(const char *iculocstr, struct pg_locale_struct *resultp) { #ifdef USE_ICU + pg_icu_library *lib; UCollator *collator; UErrorCode status; + lib = get_icu_library_for_collation(iculocstr, &iculocstr); status = U_ZERO_ERROR; - collator = ucol_open(iculocstr, &status); + collator = lib->open(iculocstr, &status); if (U_FAILURE(status)) ereport(ERROR, (errmsg("could not open collator for locale \"%s\": %s", - iculocstr, u_errorName(status)))); + iculocstr, lib->errorName(status)))); - if (U_ICU_VERSION_MAJOR_NUM < 54) + if (lib->major_version < 54) icu_set_collation_attributes(collator, iculocstr); /* We will leak this string if the caller errors later :-( */ resultp->info.icu.locale = MemoryContextStrdup(TopMemoryContext, iculocstr); resultp->info.icu.ucol = collator; + resultp->info.icu.lib = lib; #else /* not USE_ICU */ /* could get here if a collation was created by a build with ICU */ ereport(ERROR, @@ -1593,14 +1935,15 @@ pg_newlocale_from_collation(Oid collid) { char *actual_versionstr; char *collversionstr; + char *locale; collversionstr = TextDatumGetCString(datum); datum = SysCacheGetAttr(COLLOID, tp, collform->collprovider == COLLPROVIDER_ICU ? 
Anum_pg_collation_colliculocale : Anum_pg_collation_collcollate, &isnull); Assert(!isnull); + locale = TextDatumGetCString(datum); - actual_versionstr = get_collation_actual_version(collform->collprovider, - TextDatumGetCString(datum)); + actual_versionstr = get_collation_actual_version(collform->collprovider, locale); if (!actual_versionstr) { /* @@ -1614,17 +1957,44 @@ pg_newlocale_from_collation(Oid collid) } if (strcmp(actual_versionstr, collversionstr) != 0) - ereport(WARNING, - (errmsg("collation \"%s\" has version mismatch", - NameStr(collform->collname)), - errdetail("The collation in the database was created using version %s, " - "but the operating system provides version %s.", - collversionstr, actual_versionstr), - errhint("Rebuild all objects affected by this collation and run " - "ALTER COLLATION %s REFRESH VERSION, " - "or build PostgreSQL with the right library version.", - quote_qualified_identifier(get_namespace_name(collform->collnamespace), - NameStr(collform->collname))))); + { + if (collform->collprovider == COLLPROVIDER_ICU) + { + ereport(WARNING, + (errmsg("collation \"%s\" has version mismatch", + NameStr(collform->collname)), + errdetail("The collation in the database was created using version %s, " + "but the ICU library provides version %s.", + collversionstr, actual_versionstr), + strchr(locale, ':') != NULL ? 
+ errhint("Rebuild all objects affected by this collation and run " + "ALTER COLLATION %s REFRESH VERSION, " + "or build PostgreSQL with the right library version.", + quote_qualified_identifier(get_namespace_name(collform->collnamespace), + NameStr(collform->collname))) : + errhint("Install another version of ICU and select it using " + "default_icu_libary_version, " + "or rebuild all objects affect by this collation and run " + "ALTER COLLATION %s REFRESH VERSION, " + "or build PostgreSQL with the right library version.", + quote_qualified_identifier(get_namespace_name(collform->collnamespace), + NameStr(collform->collname))))); + } + else + { + ereport(WARNING, + (errmsg("collation \"%s\" has version mismatch", + NameStr(collform->collname)), + errdetail("The collation in the database was created using version %s, " + "but the operating system provides version %s.", + collversionstr, actual_versionstr), + errhint("Rebuild all objects affected by this collation and run " + "ALTER COLLATION %s REFRESH VERSION, " + "or build PostgreSQL with the right library version.", + quote_qualified_identifier(get_namespace_name(collform->collnamespace), + NameStr(collform->collname))))); + } + } } ReleaseSysCache(tp); @@ -1651,21 +2021,23 @@ get_collation_actual_version(char collprovider, const char *collcollate) #ifdef USE_ICU if (collprovider == COLLPROVIDER_ICU) { + pg_icu_library *lib; UCollator *collator; UErrorCode status; UVersionInfo versioninfo; char buf[U_MAX_VERSION_STRING_LENGTH]; + lib = get_icu_library_for_collation(collcollate, &collcollate); status = U_ZERO_ERROR; - collator = ucol_open(collcollate, &status); + collator = lib->open(collcollate, &status); if (U_FAILURE(status)) ereport(ERROR, (errmsg("could not open collator for locale \"%s\": %s", - collcollate, u_errorName(status)))); - ucol_getVersion(collator, versioninfo); - ucol_close(collator); + collcollate, lib->errorName(status)))); + lib->getVersion(collator, versioninfo); + lib->close(collator); - 
u_versionToString(versioninfo, buf); + lib->versionToString(versioninfo, buf); collversion = pstrdup(buf); } else @@ -1733,6 +2105,33 @@ get_collation_actual_version(char collprovider, const char *collcollate) #ifdef USE_ICU + +/* + * Given a major version number, look up that library and ask it for the + * complete version string. + */ +Datum +pg_icu_library_version(PG_FUNCTION_ARGS) +{ +#ifdef USE_ICU + int major_version; + pg_icu_library *lib; + UVersionInfo versioninfo; + char buf[U_MAX_VERSION_STRING_LENGTH]; + + major_version = PG_GETARG_INT32(0); + if (major_version <= 0) + major_version = U_ICU_VERSION_MAJOR_NUM; + + lib = get_icu_library(major_version); + lib->getLibraryVersion(versioninfo); + lib->versionToString(versioninfo, buf); + PG_RETURN_TEXT_P(cstring_to_text(buf)); +#else + PG_RETURN_NULL(); +#endif +} + /* * Converter object for converting between ICU's UChar strings and C strings * in database encoding. Since the database encoding doesn't change, we only @@ -1954,19 +2353,21 @@ void check_icu_locale(const char *icu_locale) { #ifdef USE_ICU + pg_icu_library *lib; UCollator *collator; UErrorCode status; + lib = get_icu_library_for_collation(icu_locale, &icu_locale); status = U_ZERO_ERROR; - collator = ucol_open(icu_locale, &status); + collator = lib->open(icu_locale, &status); if (U_FAILURE(status)) ereport(ERROR, (errmsg("could not open collator for locale \"%s\": %s", - icu_locale, u_errorName(status)))); + icu_locale, lib->errorName(status)))); - if (U_ICU_VERSION_MAJOR_NUM < 54) + if (lib->major_version < 54) icu_set_collation_attributes(collator, icu_locale); - ucol_close(collator); + lib->close(collator); #else ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c index 68e2e6f7a7..e0c86870e0 100644 --- a/src/backend/utils/adt/varchar.c +++ b/src/backend/utils/adt/varchar.c @@ -1026,11 +1026,11 @@ hashbpchar(PG_FUNCTION_ARGS) ulen = icu_to_uchar(&uchar, keydata, 
keylen); - bsize = ucol_getSortKey(mylocale->info.icu.ucol, - uchar, ulen, NULL, 0); + bsize = PG_ICU_LIB(mylocale)->getSortKey(PG_ICU_COL(mylocale), + uchar, ulen, NULL, 0); buf = palloc(bsize); - ucol_getSortKey(mylocale->info.icu.ucol, - uchar, ulen, buf, bsize); + PG_ICU_LIB(mylocale)->getSortKey(PG_ICU_COL(mylocale), + uchar, ulen, buf, bsize); result = hash_any(buf, bsize); @@ -1087,11 +1087,11 @@ hashbpcharextended(PG_FUNCTION_ARGS) ulen = icu_to_uchar(&uchar, VARDATA_ANY(key), VARSIZE_ANY_EXHDR(key)); - bsize = ucol_getSortKey(mylocale->info.icu.ucol, - uchar, ulen, NULL, 0); + bsize = PG_ICU_LIB(mylocale)->getSortKey(PG_ICU_COL(mylocale), + uchar, ulen, NULL, 0); buf = palloc(bsize); - ucol_getSortKey(mylocale->info.icu.ucol, - uchar, ulen, buf, bsize); + PG_ICU_LIB(mylocale)->getSortKey(PG_ICU_COL(mylocale), + uchar, ulen, buf, bsize); result = hash_any_extended(buf, bsize, PG_GETARG_INT64(1)); diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index c5e7ee7ca2..cf891a5654 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -1667,13 +1667,14 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid) UErrorCode status; status = U_ZERO_ERROR; - result = ucol_strcollUTF8(mylocale->info.icu.ucol, - arg1, len1, - arg2, len2, - &status); + result = PG_ICU_LIB(mylocale)->strcollUTF8(PG_ICU_COL(mylocale), + arg1, len1, + arg2, len2, + &status); if (U_FAILURE(status)) ereport(ERROR, - (errmsg("collation failed: %s", u_errorName(status)))); + (errmsg("collation failed: %s", + PG_ICU_LIB(mylocale)->errorName(status)))); } else #endif @@ -1686,9 +1687,9 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid) ulen1 = icu_to_uchar(&uchar1, arg1, len1); ulen2 = icu_to_uchar(&uchar2, arg2, len2); - result = ucol_strcoll(mylocale->info.icu.ucol, - uchar1, ulen1, - uchar2, ulen2); + result = PG_ICU_LIB(mylocale)->strcoll(PG_ICU_COL(mylocale), + uchar1, ulen1, + 
uchar2, ulen2); pfree(uchar1); pfree(uchar2); @@ -2388,13 +2389,14 @@ varstrfastcmp_locale(char *a1p, int len1, char *a2p, int len2, SortSupport ssup) UErrorCode status; status = U_ZERO_ERROR; - result = ucol_strcollUTF8(sss->locale->info.icu.ucol, - a1p, len1, - a2p, len2, - &status); + result = PG_ICU_LIB(sss->locale)->strcollUTF8(PG_ICU_COL(sss->locale), + a1p, len1, + a2p, len2, + &status); if (U_FAILURE(status)) ereport(ERROR, - (errmsg("collation failed: %s", u_errorName(status)))); + (errmsg("collation failed: %s", + PG_ICU_LIB(sss->locale)->errorName(status)))); } else #endif @@ -2407,9 +2409,9 @@ varstrfastcmp_locale(char *a1p, int len1, char *a2p, int len2, SortSupport ssup) ulen1 = icu_to_uchar(&uchar1, a1p, len1); ulen2 = icu_to_uchar(&uchar2, a2p, len2); - result = ucol_strcoll(sss->locale->info.icu.ucol, - uchar1, ulen1, - uchar2, ulen2); + result = PG_ICU_LIB(sss->locale)->strcoll(PG_ICU_COL(sss->locale), + uchar1, ulen1, + uchar2, ulen2); pfree(uchar1); pfree(uchar2); @@ -2569,24 +2571,24 @@ varstr_abbrev_convert(Datum original, SortSupport ssup) uint32_t state[2]; UErrorCode status; - uiter_setUTF8(&iter, sss->buf1, len); + PG_ICU_LIB(sss->locale)->setUTF8(&iter, sss->buf1, len); state[0] = state[1] = 0; /* won't need that again */ status = U_ZERO_ERROR; - bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol, - &iter, - state, - (uint8_t *) sss->buf2, - Min(sizeof(Datum), sss->buflen2), - &status); + bsize = PG_ICU_LIB(sss->locale)->nextSortKeyPart(PG_ICU_COL(sss->locale), + &iter, + state, + (uint8_t *) sss->buf2, + Min(sizeof(Datum), sss->buflen2), + &status); if (U_FAILURE(status)) ereport(ERROR, (errmsg("sort key generation failed: %s", - u_errorName(status)))); + PG_ICU_LIB(sss->locale)->errorName(status)))); } else - bsize = ucol_getSortKey(sss->locale->info.icu.ucol, - uchar, ulen, - (uint8_t *) sss->buf2, sss->buflen2); + bsize = PG_ICU_LIB(sss->locale)->getSortKey(PG_ICU_COL(sss->locale), + uchar, ulen, + (uint8_t *) sss->buf2, 
sss->buflen2); } else #endif diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c index a990c833c5..d18aa7a3df 100644 --- a/src/backend/utils/init/postinit.c +++ b/src/backend/utils/init/postinit.c @@ -449,26 +449,52 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect { char *actual_versionstr; char *collversionstr; + char *locale; collversionstr = TextDatumGetCString(datum); + locale = dbform->datlocprovider == COLLPROVIDER_ICU ? iculocale : collate; - actual_versionstr = get_collation_actual_version(dbform->datlocprovider, dbform->datlocprovider == COLLPROVIDER_ICU ? iculocale : collate); + actual_versionstr = get_collation_actual_version(dbform->datlocprovider, locale); if (!actual_versionstr) /* should not happen */ elog(WARNING, "database \"%s\" has no actual collation version, but a version was recorded", name); else if (strcmp(actual_versionstr, collversionstr) != 0) - ereport(WARNING, - (errmsg("database \"%s\" has a collation version mismatch", - name), - errdetail("The database was created using collation version %s, " - "but the operating system provides version %s.", - collversionstr, actual_versionstr), - errhint("Rebuild all objects in this database that use the default collation and run " - "ALTER DATABASE %s REFRESH COLLATION VERSION, " - "or build PostgreSQL with the right library version.", - quote_identifier(name)))); + { + if (dbform->datlocprovider == COLLPROVIDER_ICU) + { + ereport(WARNING, + (errmsg("database \"%s\" has a collation version mismatch", + name), + errdetail("The database was created using collation version %s, " + "but the ICU library provides version %s.", + collversionstr, actual_versionstr), + strchr(locale, ':') != NULL ? 
+ errhint("Rebuild all objects in this database that use the default collation and run " + "ALTER DATABASE %s REFRESH COLLATION VERSION, " + "or build PostgreSQL with the right library version.", + quote_identifier(name)) : + errhint("Install another version of ICU and select it using default_icu_library_version, or " + "rebuild all objects in this database that use the default collation and run " + "ALTER DATABASE %s REFRESH COLLATION VERSION, " + "or build PostgreSQL with the right library version.", + quote_identifier(name)))); + } + else + { + ereport(WARNING, + (errmsg("database \"%s\" has a collation version mismatch", + name), + errdetail("The database was created using collation version %s, " + "but the operating system provides version %s.", + collversionstr, actual_versionstr), + errhint("Rebuild all objects in this database that use the default collation and run " + "ALTER DATABASE %s REFRESH COLLATION VERSION, " + "or build PostgreSQL with the right library version.", + quote_identifier(name)))); + } + } } /* Make the locale settings visible as GUC variables, too */ diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index 836b49484a..25e905ce8b 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -2937,6 +2937,20 @@ struct config_int ConfigureNamesInt[] = check_max_worker_processes, NULL, NULL }, + { + {"default_icu_library_version", + PGC_SUSET, + COMPAT_OPTIONS_PREVIOUS, + gettext_noop("Default major version of ICU library to use for collations if not specified."), + NULL + }, + &default_icu_library_version, + 0, + 0, + 1000, + NULL, NULL, NULL + }, + { {"max_logical_replication_workers", PGC_POSTMASTER, @@ -3920,6 +3934,20 @@ struct config_string ConfigureNamesString[] = NULL, NULL, NULL }, + { + {"icu_library_path", PGC_SUSET, COMPAT_OPTIONS_PREVIOUS, + gettext_noop("Sets the path for dynamically loadable ICU libraries."), + gettext_noop("If versions of ICU other than the 
one that " + "PostgreSQL is linked against are needed, they will " + "be opened from this directory. If empty, the " + "system linker search path will be used."), + GUC_SUPERUSER_ONLY + }, + &icu_library_path, + "", + NULL, NULL, NULL + }, + { {"krb_server_keyfile", PGC_SIGHUP, CONN_AUTH_AUTH, gettext_noop("Sets the location of the Kerberos server key file."), diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 868d21c351..2713c92124 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -727,6 +727,11 @@ #lc_numeric = 'C' # locale for number formatting #lc_time = 'C' # locale for time formatting +#default_icu_library_version = 0 # default major version of ICU library + # (0 for the linked version) +#icu_library_path = '' # path for dynamically loaded ICU + # libraries + # default configuration for text search #default_text_search_config = 'pg_catalog.simple' diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 9dbe9ec801..a76eb6c94b 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -11707,6 +11707,9 @@ proname => 'pg_database_collation_actual_version', procost => '100', provolatile => 'v', prorettype => 'text', proargtypes => 'oid', prosrc => 'pg_database_collation_actual_version' }, +{ oid => '8888', descr => 'get ICU library version string', + proname => 'pg_icu_library_version', provolatile => 'v', prorettype => 'text', + proargtypes => 'int4', prosrc => 'pg_icu_library_version' }, # system management/monitoring related functions { oid => '3353', descr => 'list files in the log directory', diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h index a875942123..c52fe0c7df 100644 --- a/src/include/utils/pg_locale.h +++ b/src/include/utils/pg_locale.h @@ -17,6 +17,7 @@ #endif #ifdef USE_ICU #include <unicode/ucol.h> +#include <unicode/ubrk.h> #endif #ifdef 
USE_ICU @@ -40,6 +41,8 @@ extern PGDLLIMPORT char *locale_messages; extern PGDLLIMPORT char *locale_monetary; extern PGDLLIMPORT char *locale_numeric; extern PGDLLIMPORT char *locale_time; +extern PGDLLIMPORT char *icu_library_path; +extern PGDLLIMPORT int default_icu_library_version; /* lc_time localization cache */ extern PGDLLIMPORT char *localized_abbrev_days[]; @@ -63,6 +66,72 @@ extern struct lconv *PGLC_localeconv(void); extern void cache_locale_time(void); +#ifdef USE_ICU + +/* + * An ICU library version that we're either linked against or have loaded at + * runtime. + */ +typedef struct pg_icu_library +{ + int major_version; + void *libicui18n_handle; + void *libicuuc_handle; + void (*getLibraryVersion) (UVersionInfo info); + UCollator *(*open) (const char *loc, UErrorCode *status); + void (*close) (UCollator *coll); + void (*getVersion) (const UCollator *coll, UVersionInfo info); + void (*versionToString) (const UVersionInfo versionArray, + char *versionString); + UCollationResult(*strcoll) (const UCollator *coll, + const UChar *source, + int32_t sourceLength, + const UChar *target, + int32_t targetLength); + UCollationResult(*strcollUTF8) (const UCollator *coll, + const char *source, + int32_t sourceLength, + const char *target, + int32_t targetLength, + UErrorCode *status); + int32_t (*getSortKey) (const UCollator *coll, + const UChar *source, + int32_t sourceLength, + uint8_t *result, + int32_t resultLength); + int32_t (*nextSortKeyPart) (const UCollator *coll, + UCharIterator *iter, + uint32_t state[2], + uint8_t *dest, + int32_t count, + UErrorCode *status); + void (*setUTF8) (UCharIterator *iter, + const char *s, + int32_t length); + const char *(*errorName) (UErrorCode code); + int32_t (*strToUpper) (UChar *dest, + int32_t destCapacity, + const UChar *src, + int32_t srcLength, + const char *locale, + UErrorCode *pErrorCode); + int32_t (*strToLower) (UChar *dest, + int32_t destCapacity, + const UChar *src, + int32_t srcLength, + const char *locale, 
+ UErrorCode *pErrorCode); + int32_t (*strToTitle) (UChar *dest, + int32_t destCapacity, + const UChar *src, + int32_t srcLength, + UBreakIterator *titleIter, + const char *locale, + UErrorCode *pErrorCode); + struct pg_icu_library *next; +} pg_icu_library; + +#endif /* * We define our own wrapper around locale_t so we can keep the same @@ -84,12 +153,18 @@ struct pg_locale_struct { const char *locale; UCollator *ucol; + pg_icu_library *lib; } icu; #endif int dummy; /* in case we have neither LOCALE_T nor ICU */ } info; }; +#ifdef USE_ICU +#define PG_ICU_LIB(x) ((x)->info.icu.lib) +#define PG_ICU_COL(x) ((x)->info.icu.ucol) +#endif + typedef struct pg_locale_struct *pg_locale_t; extern PGDLLIMPORT struct pg_locale_struct default_locale; diff --git a/src/test/icu/meson.build b/src/test/icu/meson.build index 5a4f53f37f..ac2672190e 100644 --- a/src/test/icu/meson.build +++ b/src/test/icu/meson.build @@ -5,6 +5,7 @@ tests += { 'tap': { 'tests': [ 't/010_database.pl', + 't/020_multiversion.pl', ], 'env': {'with_icu': icu.found() ? 'yes' : 'no'}, }, diff --git a/src/test/icu/t/020_multiversion.pl b/src/test/icu/t/020_multiversion.pl new file mode 100644 index 0000000000..52408deb59 --- /dev/null +++ b/src/test/icu/t/020_multiversion.pl @@ -0,0 +1,203 @@ +# Copyright (c) 2022, PostgreSQL Global Development Group + +# This test requires a second major version of ICU installed in the usual +# system library search path. That is, not the one PostgreSQL was linked +# against. It also assumes that ucol_getVersion() for locale "en" will change +# between the two library versions. 
+ +use strict; +use warnings; +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; + +if ($ENV{with_icu} ne 'yes') +{ + plan skip_all => 'ICU not supported by this build'; +} + +if (!($ENV{PG_TEST_EXTRA} =~ /\bicu=([0-9]+)\b/)) +{ + plan skip_all => 'PG_TEST_EXTRA not configured to test an alternative ICU library version'; +} +my $alt_major_version = $1; + +my $node1 = PostgreSQL::Test::Cluster->new('node1'); +$node1->init; +$node1->start; + +my $linked_major_version = $node1->safe_psql('postgres', 'select pg_icu_library_version(-1)::decimal::int'); + +print "linked_major_version = $linked_major_version\n"; +print "alt_major_version = $alt_major_version\n"; + +if ($alt_major_version >= $linked_major_version) +{ + BAIL_OUT("can't run multi-version tests because ICU major version selected via PG_TEST_EXTRA is not lower than the major version the executable is linked against ($linked_major_version)"); +} + +# Sanity check that when we load a library, its u_getVersion() function tells +# us it has the major version we expect. The result is a string eg "71.1", so +# we get the major part by casting. +is($node1->safe_psql('postgres', "select pg_icu_library_version($alt_major_version)::decimal::int"), + $alt_major_version, + "alt library reports expected major version"); + +sub set_default_icu_library_version +{ + my $major_version = shift; + $node1->safe_psql('postgres', "alter system set default_icu_library_version = $major_version; select pg_reload_conf()"); +} + +my $ret; +my $stderr; + +# Create a collation that doesn't specify the ICU version to use. Which +# library we load depends on the GUC default_icu_library_version. Here it uses +# the linked version because it's set to 0 (default value in a new cluster). +set_default_icu_library_version(0); +$node1->safe_psql('postgres', "create collation c1 (provider=icu, locale='en')"); + +# No warning by default. 
+$ret = $node1->psql('postgres', "select 'x' < 'y' collate c1", stderr => \$stderr); +is($ret, 0, "can use collation"); +unlike($stderr, qr/WARNING/, "no warning for default"); + +# No warning if we explicitly select the linked version. +set_default_icu_library_version($linked_major_version); +$ret = $node1->psql('postgres', "select 'x' < 'y' collate c1", stderr => \$stderr); +unlike($stderr, qr/WARNING/, "no warning for explicit match"); + +# If we use a different major version explicitly, we get a warning that +# includes a hint that we might be able to install and select a different ICU +# version. +set_default_icu_library_version($alt_major_version); +$ret = $node1->psql('postgres', "select 'x' < 'y' collate c1", stderr => \$stderr); +is($ret, 0, "success"); +like($stderr, qr/WARNING/, "warning for incorrect major version"); +like($stderr, qr/HINT: Install another version of ICU/, "warning suggests installing another ICU version"); + +# Create a collation using the alt version without specifying it explicitly. +# This simulates a collation that was created by a different build linked +# against an older ICU. +$node1->safe_psql('postgres', "create collation c2 (provider=icu, locale='en')"); + +# Warning if we try to use it with default settings. +set_default_icu_library_version(0); +$ret = $node1->psql('postgres', "select 'x' < 'y' collate c2", stderr => \$stderr); +is($ret, 0, "success"); +like($stderr, qr/WARNING/, "warning for incorrect major version"); +like($stderr, qr/HINT: Install another version of ICU/, "warning suggests installing another ICU version"); + +# No warning if we explicitly activate the alt version. +set_default_icu_library_version($alt_major_version); +$ret = $node1->psql('postgres', "select 'x' < 'y' collate c2", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning for explicit match"); + +# Refresh the version... 
this will update it from the linked version (or +# whatever default_icu_library_version points to, here it's 0 and thus the +# linked version), because c2 is not explicitly pinned to an ICU major version. +set_default_icu_library_version(0); +$ret = $node1->psql('postgres', "alter collation c2 refresh version", stderr => \$stderr); +is($ret, 0, "success"); +like($stderr, qr/NOTICE: changing version/, "version changes"); + +# Now no warning. +$ret = $node1->psql('postgres', "select 'x' < 'y' collate c2", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "warning has gone away after refresh"); + +# Create a collation that is pinned to a specific version of ICU. +$node1->safe_psql('postgres', "create collation c3 (provider=icu, locale='$alt_major_version:en')"); + +# No warnings expected, no matter what default_icu_library_version says, because +# we always load that exact library. +set_default_icu_library_version($linked_major_version); +$ret = $node1->psql('postgres', "select 'x' < 'y' collate c3", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning for explicit lib"); +set_default_icu_library_version($alt_major_version); +$ret = $node1->psql('postgres', "select 'x' < 'y' collate c3", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning for explicit lib"); +set_default_icu_library_version(0); +$ret = $node1->psql('postgres', "select 'x' < 'y' collate c3", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning for explicit lib"); + +# Similar tests using the database default. + +set_default_icu_library_version(0); +$node1->safe_psql('postgres', "create database db2 locale_provider = icu template = template0 icu_locale = 'en'"); + +# No warning. +$ret = $node1->psql('db2', "select", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning"); + +# Warning when you log into the database. 
+set_default_icu_library_version($alt_major_version); +$ret = $node1->psql('db2', "select", stderr => \$stderr); +is($ret, 0, "success"); +like($stderr, qr/WARNING/, "warning for incorrect major version"); +like($stderr, qr/HINT: Install another version of ICU/, "warning suggests installing another ICU version"); + +# One way to clear the warning is to REFRESH. +$ret = $node1->psql('postgres', "alter database db2 refresh collation version", stderr => \$stderr); +is($ret, 0, "success"); +like($stderr, qr/NOTICE: changing version/, "version changes"); + +# Now the warning is gone. +$ret = $node1->psql('db2', "select", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning"); + +# Now we go back to using the linked version, and we'll see the warning again. +# Perhaps this case simulates the most likely real-world experience, when +# moving to a new OS that has PostgreSQL packages linked against a later ICU +# version, using all defaults. +set_default_icu_library_version(0); +$ret = $node1->psql('db2', "select", stderr => \$stderr); +is($ret, 0, "success"); +like($stderr, qr/WARNING/, "warning for incorrect major version"); +like($stderr, qr/HINT: Install another version of ICU/, "warning suggests installing another ICU version"); + +# Option 1 is to get rid of the warning by installing the library and setting +# the GUC. +set_default_icu_library_version($alt_major_version); +$ret = $node1->psql('db2', "select", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning after setting GUC"); + +# Option 2 is to rebuild indexes etc and use REFRESH. 
+set_default_icu_library_version(0); +$ret = $node1->psql('postgres', "alter database db2 refresh collation version", stderr => \$stderr); +is($ret, 0, "success"); +like($stderr, qr/NOTICE: changing version/, "version changes"); +$ret = $node1->psql('db2', "select", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning after refresh"); + +# None of this applies if you explicitly pinned your database to a specific +# ICU major version in the first place, so we ignore the GUC. +set_default_icu_library_version(0); +$node1->safe_psql('postgres', "create database db3 locale_provider = icu template = template0 icu_locale = '$alt_major_version:en'"); + +# No warning with all GUC settings. +set_default_icu_library_version($alt_major_version); +$ret = $node1->psql('db3', "select", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning with pinned library version"); +set_default_icu_library_version($linked_major_version); +$ret = $node1->psql('db3', "select", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning with pinned library version"); +set_default_icu_library_version(0); +$ret = $node1->psql('db3', "select", stderr => \$stderr); +is($ret, 0, "success"); +unlike($stderr, qr/WARNING/, "no warning with pinned library version"); + +$node1->stop; + +done_testing(); diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index f8302f1ed1..12c11f1586 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -1101,6 +1101,7 @@ HeapTupleTableSlot HistControl HotStandbyState I32 +ICU_Convert_BI_Func ICU_Convert_Func ID INFIX @@ -2854,6 +2855,7 @@ TypeName U U32 U8 +UBreakIterator UChar UCharIterator UColAttribute @@ -3484,6 +3486,7 @@ pg_funcptr_t pg_gssinfo pg_hmac_ctx pg_hmac_errno +pg_icu_library pg_int64 pg_local_to_utf_combined pg_locale_t -- 2.30.2
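
For anyone following along without the rest of the patch: the function pointers in pg_icu_library have to be filled in from a library chosen by major version at run time. The code that does that (get_icu_library()) is outside this excerpt, so what follows is only a rough sketch of the naming side of the problem, not the patch's code. It assumes the usual conventions: ICU built with symbol renaming (the default) exports entry points with a major-version suffix such as ucol_open_63, and the shared libraries are named libicui18n.so.63 on Linux-style systems and libicui18n.63.dylib on macOS. The helper names icu_symbol_name and icu_library_filename are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): build the versioned entry point
 * name that an ICU build with symbol renaming exports, e.g. "ucol_open_63".
 * Returns false if the base name is empty or the result does not fit in buf.
 */
bool
icu_symbol_name(char *buf, size_t buflen, const char *base, int major_version)
{
	int			n;

	assert(buf != NULL && base != NULL);
	if (strlen(base) == 0)
		return false;
	n = snprintf(buf, buflen, "%s_%d", base, major_version);
	return n > 0 && (size_t) n < buflen;
}

/*
 * Hypothetical helper (not from the patch): build the shared library file
 * name for a given stem ("icui18n" or "icuuc") and major version.  Only the
 * Linux-style and macOS naming schemes are covered here.
 */
bool
icu_library_filename(char *buf, size_t buflen, const char *stem,
					 int major_version, bool macos)
{
	int			n;

	assert(buf != NULL && stem != NULL);
	if (macos)
		n = snprintf(buf, buflen, "lib%s.%d.dylib", stem, major_version);
	else
		n = snprintf(buf, buflen, "lib%s.so.%d", stem, major_version);
	return n > 0 && (size_t) n < buflen;
}
```

A loader along these lines would presumably pass the file name to dlopen() and the symbol name to dlsym(). A library built with --disable-renaming exports plain ucol_open with no suffix, so the suffixed lookup fails, which is the case Jeff asked for a better error message about.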