There were a few inquiries about this topic recently, so I dug up the
old thread and patch. What we got stuck on last time was that we can't
just swap out all locale support in a database for ICU. We still need
to set the usual locale environment, otherwise some things that are not
ICU aware will break or degrade. I had initially anticipated fixing
that by converting everything that uses libc locales to ICU. But that
turned out to be tedious and ultimately not very useful as far as the
user-facing result is concerned, so I gave up.

So this is a different approach: If you choose ICU as the default locale
for a database, you still need to specify lc_ctype and lc_collate
settings, as before. Unlike in the previous patch, where the ICU
collation name was written in datcollate, there is now a third column
(daticucoll), so we can store all three values. This fixes the
described problem. Other than that, once you get all the initial
settings right, it basically just works: The places that have ICU
support now will use a database-wide ICU collation if appropriate, the
places that don't have ICU support continue to use the global libc
locale settings.
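
The fallback rule described above can be sketched in isolation. This is only an illustration of the pattern the patch repeats at each ICU-aware call site, not code from the patch: the type and constant definitions are simplified stand-ins, and lookup_explicit_collation() is a hypothetical placeholder for pg_newlocale_from_collation().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for PostgreSQL definitions (values are illustrative). */
typedef unsigned int Oid;
#define DEFAULT_COLLATION_OID 100u
#define COLLPROVIDER_ICU  'i'
#define COLLPROVIDER_LIBC 'c'

struct pg_locale_struct
{
    char provider;      /* COLLPROVIDER_ICU or COLLPROVIDER_LIBC */
    bool deterministic;
};
typedef struct pg_locale_struct *pg_locale_t;

/* Database-wide default locale; assumed to be filled in at backend startup. */
static struct pg_locale_struct default_locale = {COLLPROVIDER_LIBC, true};

/* Hypothetical placeholder for pg_newlocale_from_collation(). */
static struct pg_locale_struct explicit_icu = {COLLPROVIDER_ICU, true};

static pg_locale_t
lookup_explicit_collation(Oid collid)
{
    assert(collid != DEFAULT_COLLATION_OID);
    return &explicit_icu;
}

/*
 * Pick the locale object for a collation OID.  NULL means "no locale
 * object", i.e. the caller falls through to the global libc locale.
 */
static pg_locale_t
choose_locale(Oid collid)
{
    if (collid != DEFAULT_COLLATION_OID)
        return lookup_explicit_collation(collid);
    if (default_locale.provider == COLLPROVIDER_ICU)
        return &default_locale;     /* database-wide ICU collation */
    return NULL;                    /* database default is libc */
}
```

With a libc default, choose_locale(DEFAULT_COLLATION_OID) yields NULL and the caller keeps using the libc code path; once default_locale.provider is set to COLLPROVIDER_ICU, the same call returns &default_locale.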

I changed the datcollate, datctype, and the new daticucoll fields to
type text (from name). That way, the daticucoll field can be set to
null if it's not applicable. Also, the limit of 63 characters can
actually be a problem if you want to use some combination of the options
that ICU locales offer. And for less extreme uses, having
variable-length fields will save some storage, since typical locale
names are much shorter.
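
To make the 63-character limit concrete, here is a small sketch. It assumes the default NAMEDATALEN of 64, and copy_as_name() is a hypothetical helper that only approximates how a fixed-size name field truncates its input; the locale string is an example I made up by chaining standard collation keywords.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NAMEDATALEN 64  /* assumed default: 63 bytes of payload plus NUL */

/* A long but well-formed ICU language tag with several collation options. */
static const char long_icu_locale[] =
    "und-u-co-standard-ka-shifted-kb-true-kc-true-kf-upper-kn-true-ks-level4";

/* True if the locale string would fit into a "name" column unchanged. */
static bool
fits_in_name(const char *locale)
{
    return strlen(locale) < NAMEDATALEN;
}

/* Hypothetical helper: copy into a fixed-size field, truncating silently. */
static void
copy_as_name(char dst[NAMEDATALEN], const char *src)
{
    snprintf(dst, NAMEDATALEN, "%s", src);
}
```

The example tag is 71 bytes long, so it does not fit and would lose its trailing options when stored in a name field, while something like "en_US.UTF-8" uses only a fraction of the fixed 64 bytes.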

For the same reasons and to keep things consistent, I also changed the
analogous pg_collation fields in the same way. This also removes some
weird code that had to check that collcollate and collctype are the same
for ICU, so it's overall cleaner.

From 4eb9fbac238c1abf481fa43431ecc22e782a5290 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pe...@eisentraut.org>
Date: Thu, 30 Dec 2021 12:47:24 +0100
Subject: [PATCH v3] Add option to use ICU as global collation provider

This adds the option to use ICU as the default collation provider for
either the whole cluster or a database. New options for initdb,
createdb, and CREATE DATABASE are used to select this.

Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com
---
doc/src/sgml/catalogs.sgml | 13 +-
doc/src/sgml/ref/create_database.sgml | 16 ++
doc/src/sgml/ref/createdb.sgml | 9 +
doc/src/sgml/ref/initdb.sgml | 23 ++
src/backend/access/hash/hashfunc.c | 18 +-
src/backend/catalog/pg_collation.c | 24 +-
src/backend/commands/collationcmds.c | 120 +++++++---
src/backend/commands/dbcommands.c | 93 ++++++--
src/backend/regex/regc_pg_locale.c | 7 +-
src/backend/utils/adt/formatting.c | 6 +
src/backend/utils/adt/like.c | 20 +-
src/backend/utils/adt/like_support.c | 2 +
src/backend/utils/adt/pg_locale.c | 223 +++++++++++-------
src/backend/utils/adt/varchar.c | 22 +-
src/backend/utils/adt/varlena.c | 26 +-
src/backend/utils/init/postinit.c | 37 ++-
src/bin/initdb/Makefile | 2 +
src/bin/initdb/initdb.c | 62 ++++-
src/bin/initdb/t/001_initdb.pl | 18 +-
src/bin/pg_dump/pg_dump.c | 16 ++
src/bin/psql/describe.c | 23 +-
src/bin/psql/tab-complete.c | 2 +-
src/bin/scripts/Makefile | 2 +
src/bin/scripts/createdb.c | 9 +
src/bin/scripts/t/020_createdb.pl | 20 +-
src/include/catalog/pg_collation.dat | 3 +-
src/include/catalog/pg_collation.h | 6 +-
src/include/catalog/pg_database.dat | 2 +-
src/include/catalog/pg_database.h | 16 +-
src/include/utils/pg_locale.h | 6 +
.../regress/expected/collate.icu.utf8.out | 10 +-
src/test/regress/sql/collate.icu.utf8.sql | 8 +-
32 files changed, 665 insertions(+), 199 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537b07..89e7279030 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2368,7 +2368,7 @@ <title><structname>pg_collation</structname>
Columns</title>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>collcollate</structfield> <type>name</type>
+ <structfield>collcollate</structfield> <type>text</type>
</para>
<para>
<symbol>LC_COLLATE</symbol> for this collation object
@@ -2377,13 +2377,22 @@ <title><structname>pg_collation</structname>
Columns</title>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>collctype</structfield> <type>name</type>
+ <structfield>collctype</structfield> <type>text</type>
</para>
<para>
<symbol>LC_CTYPE</symbol> for this collation object
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>collicucoll</structfield> <type>text</type>
+ </para>
+ <para>
+ ICU collation string
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>collversion</structfield> <type>text</type>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index 41cb4068ec..7374a9fad5 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -28,6 +28,7 @@
[ LOCALE [=] <replaceable class="parameter">locale</replaceable> ]
[ LC_COLLATE [=] <replaceable
class="parameter">lc_collate</replaceable> ]
[ LC_CTYPE [=] <replaceable
class="parameter">lc_ctype</replaceable> ]
+ [ COLLATION_PROVIDER [=] <replaceable
class="parameter">collation_provider</replaceable> ]
[ TABLESPACE [=] <replaceable
class="parameter">tablespace_name</replaceable> ]
[ ALLOW_CONNECTIONS [=] <replaceable
class="parameter">allowconn</replaceable> ]
[ CONNECTION LIMIT [=] <replaceable
class="parameter">connlimit</replaceable> ]
@@ -157,6 +158,21 @@ <title>Parameters</title>
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><replaceable>collation_provider</replaceable></term>
+
+ <listitem>
+ <para>
+ Specifies the provider to use for the default collation in this
+ database. Possible values are:
+ <literal>icu</literal>,<indexterm><primary>ICU</primary></indexterm>
+ <literal>libc</literal>. <literal>libc</literal> is the default. The
+ available choices depend on the operating system and build options.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><replaceable class="parameter">tablespace_name</replaceable></term>
<listitem>
diff --git a/doc/src/sgml/ref/createdb.sgml b/doc/src/sgml/ref/createdb.sgml
index 86473455c9..4b07363fcc 100644
--- a/doc/src/sgml/ref/createdb.sgml
+++ b/doc/src/sgml/ref/createdb.sgml
@@ -83,6 +83,15 @@ <title>Options</title>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--collation-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+ <listitem>
+ <para>
+ Specifies the collation provider for the database's default collation.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-D <replaceable
class="parameter">tablespace</replaceable></option></term>
<term><option>--tablespace=<replaceable
class="parameter">tablespace</replaceable></option></term>
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 8f71c7c962..77618d9a7a 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -166,6 +166,18 @@ <title>Options</title>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--collation-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+ <listitem>
+ <para>
+ This option sets the collation provider for databases created in the
+ new cluster. It can be overridden in the <command>CREATE
+ DATABASE</command> command when new databases are subsequently
+ created. The default is <literal>libc</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-D <replaceable
class="parameter">directory</replaceable></option></term>
<term><option>--pgdata=<replaceable
class="parameter">directory</replaceable></option></term>
@@ -210,6 +222,17 @@ <title>Options</title>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--icu-locale=<replaceable>locale</replaceable></option></term>
+ <listitem>
+ <para>
+ Specifies the ICU locale if the ICU collation provider is used. If
+ this is not specified, the value from the <option>--locale</option>
+ option is used.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="app-initdb-data-checksums" xreflabel="data checksums">
<term><option>-k</option></term>
<term><option>--data-checksums</option></term>
diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c
index 242333920e..6c29816193 100644
--- a/src/backend/access/hash/hashfunc.c
+++ b/src/backend/access/hash/hashfunc.c
@@ -278,8 +278,13 @@ hashtext(PG_FUNCTION_ARGS)
errmsg("could not determine which collation to
use for string hashing"),
errhint("Use the COLLATE clause to set the
collation explicitly.")));
- if (!lc_collate_is_c(collid) && collid != DEFAULT_COLLATION_OID)
- mylocale = pg_newlocale_from_collation(collid);
+ if (!lc_collate_is_c(collid))
+ {
+ if (collid != DEFAULT_COLLATION_OID)
+ mylocale = pg_newlocale_from_collation(collid);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
+ }
if (!mylocale || mylocale->deterministic)
{
@@ -334,8 +339,13 @@ hashtextextended(PG_FUNCTION_ARGS)
errmsg("could not determine which collation to
use for string hashing"),
errhint("Use the COLLATE clause to set the
collation explicitly.")));
- if (!lc_collate_is_c(collid) && collid != DEFAULT_COLLATION_OID)
- mylocale = pg_newlocale_from_collation(collid);
+ if (!lc_collate_is_c(collid))
+ {
+ if (collid != DEFAULT_COLLATION_OID)
+ mylocale = pg_newlocale_from_collation(collid);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
+ }
if (!mylocale || mylocale->deterministic)
{
diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index 19068b652a..c3365e99c3 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -49,6 +49,7 @@ CollationCreate(const char *collname, Oid collnamespace,
bool collisdeterministic,
int32 collencoding,
const char *collcollate, const char *collctype,
+ const char *collicucoll,
const char *collversion,
bool if_not_exists,
bool quiet)
@@ -58,9 +59,7 @@ CollationCreate(const char *collname, Oid collnamespace,
HeapTuple tup;
Datum values[Natts_pg_collation];
bool nulls[Natts_pg_collation];
- NameData name_name,
- name_collate,
- name_ctype;
+ NameData name_name;
Oid oid;
ObjectAddress myself,
referenced;
@@ -68,8 +67,7 @@ CollationCreate(const char *collname, Oid collnamespace,
AssertArg(collname);
AssertArg(collnamespace);
AssertArg(collowner);
- AssertArg(collcollate);
- AssertArg(collctype);
+ AssertArg((collcollate && collctype) || collicucoll);
/*
* Make sure there is no existing collation of same name & encoding.
@@ -163,10 +161,18 @@ CollationCreate(const char *collname, Oid collnamespace,
values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
values[Anum_pg_collation_collisdeterministic - 1] =
BoolGetDatum(collisdeterministic);
values[Anum_pg_collation_collencoding - 1] =
Int32GetDatum(collencoding);
- namestrcpy(&name_collate, collcollate);
- values[Anum_pg_collation_collcollate - 1] = NameGetDatum(&name_collate);
- namestrcpy(&name_ctype, collctype);
- values[Anum_pg_collation_collctype - 1] = NameGetDatum(&name_ctype);
+ if (collcollate)
+ values[Anum_pg_collation_collcollate - 1] =
CStringGetTextDatum(collcollate);
+ else
+ nulls[Anum_pg_collation_collcollate - 1] = true;
+ if (collctype)
+ values[Anum_pg_collation_collctype - 1] =
CStringGetTextDatum(collctype);
+ else
+ nulls[Anum_pg_collation_collctype - 1] = true;
+ if (collicucoll)
+ values[Anum_pg_collation_collicucoll - 1] =
CStringGetTextDatum(collicucoll);
+ else
+ nulls[Anum_pg_collation_collicucoll - 1] = true;
if (collversion)
values[Anum_pg_collation_collversion - 1] =
CStringGetTextDatum(collversion);
else
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 53fc579f37..7dd125705b 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -65,6 +65,7 @@ DefineCollation(ParseState *pstate, List *names, List
*parameters, bool if_not_e
DefElem *versionEl = NULL;
char *collcollate = NULL;
char *collctype = NULL;
+ char *collicucoll = NULL;
char *collproviderstr = NULL;
bool collisdeterministic = true;
int collencoding = 0;
@@ -129,18 +130,36 @@ DefineCollation(ParseState *pstate, List *names, List
*parameters, bool if_not_e
{
Oid collid;
HeapTuple tp;
+ Datum datum;
+ bool isnull;
collid = get_collation_oid(defGetQualifiedName(fromEl), false);
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u",
collid);
- collcollate = pstrdup(NameStr(((Form_pg_collation)
GETSTRUCT(tp))->collcollate));
- collctype = pstrdup(NameStr(((Form_pg_collation)
GETSTRUCT(tp))->collctype));
collprovider = ((Form_pg_collation)
GETSTRUCT(tp))->collprovider;
collisdeterministic = ((Form_pg_collation)
GETSTRUCT(tp))->collisdeterministic;
collencoding = ((Form_pg_collation)
GETSTRUCT(tp))->collencoding;
+ datum = SysCacheGetAttr(COLLOID, tp,
Anum_pg_collation_collcollate, &isnull);
+ if (!isnull)
+ collcollate = TextDatumGetCString(datum);
+ else
+ collcollate = NULL;
+
+ datum = SysCacheGetAttr(COLLOID, tp,
Anum_pg_collation_collctype, &isnull);
+ if (!isnull)
+ collctype = TextDatumGetCString(datum);
+ else
+ collctype = NULL;
+
+ datum = SysCacheGetAttr(COLLOID, tp,
Anum_pg_collation_collicucoll, &isnull);
+ if (!isnull)
+ collicucoll = TextDatumGetCString(datum);
+ else
+ collicucoll = NULL;
+
ReleaseSysCache(tp);
/*
@@ -156,18 +175,6 @@ DefineCollation(ParseState *pstate, List *names, List
*parameters, bool if_not_e
errmsg("collation \"default\" cannot
be copied")));
}
- if (localeEl)
- {
- collcollate = defGetString(localeEl);
- collctype = defGetString(localeEl);
- }
-
- if (lccollateEl)
- collcollate = defGetString(lccollateEl);
-
- if (lcctypeEl)
- collctype = defGetString(lcctypeEl);
-
if (providerEl)
collproviderstr = defGetString(providerEl);
@@ -192,15 +199,43 @@ DefineCollation(ParseState *pstate, List *names, List
*parameters, bool if_not_e
else if (!fromEl)
collprovider = COLLPROVIDER_LIBC;
- if (!collcollate)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
- errmsg("parameter \"lc_collate\" must be
specified")));
+ if (localeEl)
+ {
+ if (collprovider == COLLPROVIDER_LIBC)
+ {
+ collcollate = defGetString(localeEl);
+ collctype = defGetString(localeEl);
+ }
+ else
+ collicucoll = defGetString(localeEl);
+ }
- if (!collctype)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
- errmsg("parameter \"lc_ctype\" must be
specified")));
+ if (lccollateEl)
+ collcollate = defGetString(lccollateEl);
+
+ if (lcctypeEl)
+ collctype = defGetString(lcctypeEl);
+
+ if (collprovider == COLLPROVIDER_LIBC)
+ {
+ if (!collcollate)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("parameter \"lc_collate\" must be specified")));
+
+ if (!collctype)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("parameter \"lc_ctype\" must be specified")));
+ }
+
+ if (collprovider == COLLPROVIDER_ICU)
+ {
+ if (!collicucoll)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("parameter \"locale\" must be specified")));
+ }
/*
* Nondeterministic collations are currently only supported with ICU
@@ -243,7 +278,7 @@ DefineCollation(ParseState *pstate, List *names, List
*parameters, bool if_not_e
}
if (!collversion)
- collversion = get_collation_actual_version(collprovider,
collcollate);
+ collversion = get_collation_actual_version(collprovider,
collprovider == COLLPROVIDER_ICU ? collicucoll : collcollate);
newoid = CollationCreate(collName,
collNamespace,
@@ -253,6 +288,7 @@ DefineCollation(ParseState *pstate, List *names, List
*parameters, bool if_not_e
collencoding,
collcollate,
collctype,
+ collicucoll,
collversion,
if_not_exists,
false); /* not
quiet */
@@ -336,7 +372,13 @@ AlterCollation(AlterCollationStmt *stmt)
&isnull);
oldversion = isnull ? NULL : TextDatumGetCString(collversion);
- newversion = get_collation_actual_version(collForm->collprovider,
NameStr(collForm->collcollate));
+ {
+ Datum datum;
+
+ datum = SysCacheGetAttr(COLLOID, tup, collForm->collprovider ==
COLLPROVIDER_ICU ? Anum_pg_collation_collicucoll :
Anum_pg_collation_collcollate, &isnull);
+ Assert(!isnull);
+ newversion =
get_collation_actual_version(collForm->collprovider,
TextDatumGetCString(datum));
+ }
/* cannot change from NULL to non-NULL or vice versa */
if ((!oldversion && newversion) || (oldversion && !newversion))
@@ -383,8 +425,9 @@ pg_collation_actual_version(PG_FUNCTION_ARGS)
{
Oid collid = PG_GETARG_OID(0);
HeapTuple tp;
- char *collcollate;
char collprovider;
+ Datum datum;
+ bool isnull;
char *version;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
@@ -393,12 +436,19 @@ pg_collation_actual_version(PG_FUNCTION_ARGS)
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("collation with OID %u does not exist",
collid)));
- collcollate = pstrdup(NameStr(((Form_pg_collation)
GETSTRUCT(tp))->collcollate));
collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
- ReleaseSysCache(tp);
+ if (collprovider != COLLPROVIDER_DEFAULT)
+ {
+ datum = SysCacheGetAttr(COLLOID, tp, collprovider ==
COLLPROVIDER_ICU ? Anum_pg_collation_collicucoll :
Anum_pg_collation_collcollate, &isnull);
+ Assert(!isnull);
+
+ version = get_collation_actual_version(collprovider,
TextDatumGetCString(datum));
+ }
+ else
+ version = NULL;
- version = get_collation_actual_version(collprovider, collcollate);
+ ReleaseSysCache(tp);
if (version)
PG_RETURN_TEXT_P(cstring_to_text(version));
@@ -623,7 +673,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
*/
collid = CollationCreate(localebuf, nspid, GetUserId(),
COLLPROVIDER_LIBC, true, enc,
- localebuf, localebuf,
+ localebuf, localebuf, NULL,
get_collation_actual_version(COLLPROVIDER_LIBC, localebuf),
true,
true);
if (OidIsValid(collid))
@@ -684,7 +734,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
collid = CollationCreate(alias, nspid, GetUserId(),
COLLPROVIDER_LIBC, true, enc,
- locale, locale,
+ locale, locale, NULL,
get_collation_actual_version(COLLPROVIDER_LIBC, locale),
true,
true);
if (OidIsValid(collid))
@@ -725,7 +775,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
const char *name;
char *langtag;
char *icucomment;
- const char *collcollate;
+ const char *icucollstr;
Oid collid;
if (i == -1)
@@ -734,20 +784,20 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
name = uloc_getAvailable(i);
langtag = get_icu_language_tag(name);
- collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag :
name;
+ icucollstr = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag :
name;
/*
* Be paranoid about not allowing any non-ASCII strings
into
* pg_collation
*/
- if (!pg_is_ascii(langtag) || !pg_is_ascii(collcollate))
+ if (!pg_is_ascii(langtag) || !pg_is_ascii(icucollstr))
continue;
collid = CollationCreate(psprintf("%s-x-icu", langtag),
nspid,
GetUserId(),
COLLPROVIDER_ICU, true, -1,
- collcollate, collcollate,
- get_collation_actual_version(COLLPROVIDER_ICU, collcollate),
+ NULL, NULL, icucollstr,
+ get_collation_actual_version(COLLPROVIDER_ICU, icucollstr),
true,
true);
if (OidIsValid(collid))
{
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 029fab48df..7928790cc9 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -36,6 +36,7 @@
#include "catalog/indexing.h"
#include "catalog/objectaccess.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_subscription.h"
@@ -86,7 +87,8 @@ static bool get_db_info(const char *name, LOCKMODE lockmode,
int *encodingP, bool
*dbIsTemplateP, bool *dbAllowConnP,
Oid *dbLastSysOidP,
TransactionId *dbFrozenXidP,
MultiXactId *dbMinMultiP,
- Oid *dbTablespace, char
**dbCollate, char **dbCtype);
+ Oid *dbTablespace, char
**dbCollate, char **dbCtype, char **dbIcucoll,
+ char *dbCollProvider);
static bool have_createdb_privilege(void);
static void remove_dbtablespaces(Oid db_id);
static bool check_db_file_conflict(Oid db_id);
@@ -106,6 +108,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
int src_encoding = -1;
char *src_collate = NULL;
char *src_ctype = NULL;
+ char *src_icucoll = NULL;
+ char src_collprovider;
bool src_istemplate;
bool src_allowconn;
Oid src_lastsysoid = InvalidOid;
@@ -127,6 +131,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
DefElem *dlocale = NULL;
DefElem *dcollate = NULL;
DefElem *dctype = NULL;
+ DefElem *dcollprovider = NULL;
DefElem *distemplate = NULL;
DefElem *dallowconnections = NULL;
DefElem *dconnlimit = NULL;
@@ -135,6 +140,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
const char *dbtemplate = NULL;
char *dbcollate = NULL;
char *dbctype = NULL;
+ char *dbicucoll = NULL;
+ char dbcollprovider = '\0';
char *canonname;
int encoding = -1;
bool dbistemplate = false;
@@ -191,6 +198,15 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
errorConflictingDefElem(defel, pstate);
dctype = defel;
}
+ else if (strcmp(defel->defname, "collation_provider") == 0)
+ {
+ if (dcollprovider)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or
redundant options"),
+ parser_errposition(pstate,
defel->location)));
+ dcollprovider = defel;
+ }
else if (strcmp(defel->defname, "is_template") == 0)
{
if (distemplate)
@@ -224,12 +240,6 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
parser_errposition(pstate,
defel->location)));
}
- if (dlocale && (dcollate || dctype))
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("conflicting or redundant options"),
- errdetail("LOCALE cannot be specified together
with LC_COLLATE or LC_CTYPE.")));
-
if (downer && downer->arg)
dbowner = defGetString(downer);
if (dtemplate && dtemplate->arg)
@@ -266,11 +276,29 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
{
dbcollate = defGetString(dlocale);
dbctype = defGetString(dlocale);
+ dbicucoll = defGetString(dlocale);
}
if (dcollate && dcollate->arg)
dbcollate = defGetString(dcollate);
if (dctype && dctype->arg)
dbctype = defGetString(dctype);
+ if (dcollprovider && dcollprovider->arg)
+ {
+ char *collproviderstr = defGetString(dcollprovider);
+
+#ifdef USE_ICU
+ if (pg_strcasecmp(collproviderstr, "icu") == 0)
+ dbcollprovider = COLLPROVIDER_ICU;
+ else
+#endif
+ if (pg_strcasecmp(collproviderstr, "libc") == 0)
+ dbcollprovider = COLLPROVIDER_LIBC;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("unrecognized collation provider: %s",
+ collproviderstr)));
+ }
if (distemplate && distemplate->arg)
dbistemplate = defGetBoolean(distemplate);
if (dallowconnections && dallowconnections->arg)
@@ -320,7 +348,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
&src_dboid, &src_owner, &src_encoding,
&src_istemplate, &src_allowconn,
&src_lastsysoid,
&src_frozenxid, &src_minmxid,
&src_deftablespace,
- &src_collate, &src_ctype))
+ &src_collate, &src_ctype,
&src_icucoll, &src_collprovider))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_DATABASE),
errmsg("template database \"%s\" does not
exist",
@@ -346,6 +374,10 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
dbcollate = src_collate;
if (dbctype == NULL)
dbctype = src_ctype;
+ if (dbicucoll == NULL)
+ dbicucoll = src_icucoll;
+ if (dbcollprovider == '\0')
+ dbcollprovider = src_collprovider;
/* Some encodings are client only */
if (!PG_VALID_BE_ENCODING(encoding))
@@ -525,10 +557,13 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
DirectFunctionCall1(namein, CStringGetDatum(dbname));
new_record[Anum_pg_database_datdba - 1] = ObjectIdGetDatum(datdba);
new_record[Anum_pg_database_encoding - 1] = Int32GetDatum(encoding);
- new_record[Anum_pg_database_datcollate - 1] =
- DirectFunctionCall1(namein, CStringGetDatum(dbcollate));
- new_record[Anum_pg_database_datctype - 1] =
- DirectFunctionCall1(namein, CStringGetDatum(dbctype));
+ new_record[Anum_pg_database_datcollate - 1] =
CStringGetTextDatum(dbcollate);
+ new_record[Anum_pg_database_datctype - 1] =
CStringGetTextDatum(dbctype);
+ if (dbicucoll)
+ new_record[Anum_pg_database_daticucoll - 1] =
CStringGetTextDatum(dbicucoll);
+ else
+ new_record_nulls[Anum_pg_database_daticucoll - 1] = true;
+ new_record[Anum_pg_database_datcollprovider - 1] =
CharGetDatum(dbcollprovider);
new_record[Anum_pg_database_datistemplate - 1] =
BoolGetDatum(dbistemplate);
new_record[Anum_pg_database_datallowconn - 1] =
BoolGetDatum(dballowconnections);
new_record[Anum_pg_database_datconnlimit - 1] =
Int32GetDatum(dbconnlimit);
@@ -802,7 +837,7 @@ dropdb(const char *dbname, bool missing_ok, bool force)
pgdbrel = table_open(DatabaseRelationId, RowExclusiveLock);
if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL,
- &db_istemplate, NULL, NULL, NULL,
NULL, NULL, NULL, NULL))
+ &db_istemplate, NULL, NULL, NULL,
NULL, NULL, NULL, NULL, NULL, NULL))
{
if (!missing_ok)
{
@@ -1001,7 +1036,7 @@ RenameDatabase(const char *oldname, const char *newname)
rel = table_open(DatabaseRelationId, RowExclusiveLock);
if (!get_db_info(oldname, AccessExclusiveLock, &db_id, NULL, NULL,
- NULL, NULL, NULL, NULL, NULL, NULL,
NULL, NULL))
+ NULL, NULL, NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_DATABASE),
errmsg("database \"%s\" does not exist",
oldname)));
@@ -1114,7 +1149,7 @@ movedb(const char *dbname, const char *tblspcname)
pgdbrel = table_open(DatabaseRelationId, RowExclusiveLock);
if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL,
- NULL, NULL, NULL, NULL, NULL,
&src_tblspcoid, NULL, NULL))
+ NULL, NULL, NULL, NULL, NULL,
&src_tblspcoid, NULL, NULL, NULL, NULL))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_DATABASE),
errmsg("database \"%s\" does not exist",
dbname)));
@@ -1759,7 +1794,8 @@ get_db_info(const char *name, LOCKMODE lockmode,
int *encodingP, bool *dbIsTemplateP, bool *dbAllowConnP,
Oid *dbLastSysOidP, TransactionId *dbFrozenXidP,
MultiXactId *dbMinMultiP,
- Oid *dbTablespace, char **dbCollate, char **dbCtype)
+ Oid *dbTablespace, char **dbCollate, char **dbCtype,
char **dbIcucoll,
+ char *dbCollProvider)
{
bool result = false;
Relation relation;
@@ -1824,6 +1860,9 @@ get_db_info(const char *name, LOCKMODE lockmode,
if (strcmp(name, NameStr(dbform->datname)) == 0)
{
+ Datum datum;
+ bool isnull;
+
/* oid of the database */
if (dbIdP)
*dbIdP = dbOid;
@@ -1852,10 +1891,28 @@ get_db_info(const char *name, LOCKMODE lockmode,
if (dbTablespace)
*dbTablespace = dbform->dattablespace;
/* default locale settings for this database */
+ if (dbCollProvider)
+ *dbCollProvider =
dbform->datcollprovider;
if (dbCollate)
- *dbCollate =
pstrdup(NameStr(dbform->datcollate));
+ {
+ datum = SysCacheGetAttr(DATABASEOID,
tuple, Anum_pg_database_datcollate, &isnull);
+ Assert(!isnull);
+ *dbCollate = TextDatumGetCString(datum);
+ }
if (dbCtype)
- *dbCtype =
pstrdup(NameStr(dbform->datctype));
+ {
+ datum = SysCacheGetAttr(DATABASEOID,
tuple, Anum_pg_database_datctype, &isnull);
+ Assert(!isnull);
+ *dbCtype = TextDatumGetCString(datum);
+ }
+ if (dbIcucoll)
+ {
+ datum = SysCacheGetAttr(DATABASEOID,
tuple, Anum_pg_database_daticucoll, &isnull);
+ if (isnull)
+ *dbIcucoll = NULL;
+ else
+ *dbIcucoll =
TextDatumGetCString(datum);
+ }
ReleaseSysCache(tuple);
result = true;
break;
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index bbbd61c604..3fe0f1c386 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -241,7 +241,12 @@ pg_set_regex_collation(Oid collation)
else
{
if (collation == DEFAULT_COLLATION_OID)
- pg_regex_locale = 0;
+ {
+ if (default_locale.provider == COLLPROVIDER_ICU)
+ pg_regex_locale = &default_locale;
+ else
+ pg_regex_locale = 0;
+ }
else if (OidIsValid(collation))
{
/*
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 419469fab5..320047fbd0 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1666,6 +1666,8 @@ str_tolower(const char *buff, size_t nbytes, Oid collid)
}
mylocale = pg_newlocale_from_collation(collid);
}
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
#ifdef USE_ICU
if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
@@ -1790,6 +1792,8 @@ str_toupper(const char *buff, size_t nbytes, Oid collid)
}
mylocale = pg_newlocale_from_collation(collid);
}
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
#ifdef USE_ICU
if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
@@ -1915,6 +1919,8 @@ str_initcap(const char *buff, size_t nbytes, Oid collid)
}
mylocale = pg_newlocale_from_collation(collid);
}
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
#ifdef USE_ICU
if (mylocale && mylocale->provider == COLLPROVIDER_ICU)
diff --git a/src/backend/utils/adt/like.c b/src/backend/utils/adt/like.c
index eed183cd0d..85a668fa36 100644
--- a/src/backend/utils/adt/like.c
+++ b/src/backend/utils/adt/like.c
@@ -150,9 +150,14 @@ SB_lower_char(unsigned char c, pg_locale_t locale, bool
locale_is_c)
static inline int
GenericMatchText(const char *s, int slen, const char *p, int plen, Oid
collation)
{
- if (collation && !lc_ctype_is_c(collation) && collation !=
DEFAULT_COLLATION_OID)
+ if (collation && !lc_ctype_is_c(collation))
{
- pg_locale_t locale = pg_newlocale_from_collation(collation);
+ pg_locale_t locale = 0;
+
+ if (collation != DEFAULT_COLLATION_OID)
+ locale = pg_newlocale_from_collation(collation);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ locale = &default_locale;
if (locale && !locale->deterministic)
ereport(ERROR,
@@ -195,11 +200,14 @@ Generic_Text_IC_like(text *str, text *pat, Oid collation)
}
locale = pg_newlocale_from_collation(collation);
- if (locale && !locale->deterministic)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("nondeterministic collations
are not supported for ILIKE")));
}
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ locale = &default_locale;
+
+ if (locale && !locale->deterministic)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("nondeterministic collations are not
supported for ILIKE")));
/*
* For efficiency reasons, in the single byte case we don't call lower()
diff --git a/src/backend/utils/adt/like_support.c b/src/backend/utils/adt/like_support.c
index 988568825e..6b82ad1d43 100644
--- a/src/backend/utils/adt/like_support.c
+++ b/src/backend/utils/adt/like_support.c
@@ -1030,6 +1030,8 @@ like_fixed_prefix(Const *patt_const, bool
case_insensitive, Oid collation,
}
locale = pg_newlocale_from_collation(collation);
}
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ locale = &default_locale;
}
if (typeid != BYTEAOID)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index cc2ab95535..8d32bc68d8 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -1289,21 +1289,36 @@ lookup_collation_cache(Oid collation, bool set_flags)
/* Attempt to set the flags */
HeapTuple tp;
Form_pg_collation collform;
- const char *collcollate;
- const char *collctype;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collation));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u",
collation);
collform = (Form_pg_collation) GETSTRUCT(tp);
- collcollate = NameStr(collform->collcollate);
- collctype = NameStr(collform->collctype);
-
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ if (collform->collprovider == COLLPROVIDER_LIBC)
+ {
+ Datum datum;
+ bool isnull;
+ const char *collcollate;
+ const char *collctype;
+
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collcollate, &isnull);
+ Assert(!isnull);
+ collcollate = TextDatumGetCString(datum);
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collctype, &isnull);
+ Assert(!isnull);
+ collctype = TextDatumGetCString(datum);
+
+ cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
+ (strcmp(collcollate, "POSIX") == 0));
+ cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
+ (strcmp(collctype, "POSIX") == 0));
+ }
+ else
+ {
+ cache_entry->collate_is_c = false;
+ cache_entry->ctype_is_c = false;
+ }
cache_entry->flags_valid = true;
@@ -1336,6 +1351,9 @@ lc_collate_is_c(Oid collation)
static int result = -1;
char *localeptr;
+ if (default_locale.provider == COLLPROVIDER_ICU)
+ return false;
+
if (result >= 0)
return (bool) result;
localeptr = setlocale(LC_COLLATE, NULL);
@@ -1386,6 +1404,9 @@ lc_ctype_is_c(Oid collation)
static int result = -1;
char *localeptr;
+ if (default_locale.provider == COLLPROVIDER_ICU)
+ return false;
+
if (result >= 0)
return (bool) result;
localeptr = setlocale(LC_CTYPE, NULL);
@@ -1414,6 +1435,88 @@ lc_ctype_is_c(Oid collation)
return (lookup_collation_cache(collation, true))->ctype_is_c;
}
+struct pg_locale_struct default_locale;
+
+void
+make_icu_collator(const char *icucollstr,
+ struct pg_locale_struct *resultp)
+{
+#ifdef USE_ICU
+ UCollator *collator;
+ UErrorCode status;
+
+ status = U_ZERO_ERROR;
+ collator = ucol_open(icucollstr, &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("could not open collator for locale \"%s\": %s",
+ icucollstr, u_errorName(status))));
+
+ if (U_ICU_VERSION_MAJOR_NUM < 54)
+ icu_set_collation_attributes(collator, icucollstr);
+
+ /* We will leak this string if we get an error below :-( */
+ resultp->info.icu.locale = MemoryContextStrdup(TopMemoryContext, icucollstr);
+ resultp->info.icu.ucol = collator;
+#else /* not USE_ICU */
+ /* could get here if a collation was created by a build with ICU */
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"), \
+ errhint("You need to rebuild PostgreSQL using %s.", "--with-icu")));
+#endif /* not USE_ICU */
+}
+
+void
+check_collation_version(HeapTuple colltuple)
+{
+ Form_pg_collation collform;
+ Datum collversion;
+ bool isnull;
+
+ collform = (Form_pg_collation) GETSTRUCT(colltuple);
+
+ collversion = SysCacheGetAttr(COLLOID, colltuple, Anum_pg_collation_collversion,
+ &isnull);
+ if (!isnull)
+ {
+ char *actual_versionstr;
+ char *collversionstr;
+ Datum datum;
+ bool isnull;
+
+ datum = SysCacheGetAttr(COLLOID, colltuple, collform->collprovider == COLLPROVIDER_ICU ? Anum_pg_collation_collicucoll : Anum_pg_collation_collcollate, &isnull);
+ Assert(!isnull);
+
+ actual_versionstr = get_collation_actual_version(collform->collprovider,
+ TextDatumGetCString(datum));
+ if (!actual_versionstr)
+ {
+ /*
+ * This could happen when specifying a version in CREATE
+ * COLLATION for a libc locale, or manually creating a mess in
+ * the catalogs.
+ */
+ ereport(ERROR,
+ (errmsg("collation \"%s\" has no actual version, but a version was specified",
+ NameStr(collform->collname))));
+ }
+ collversionstr = TextDatumGetCString(collversion);
+
+ if (strcmp(actual_versionstr, collversionstr) != 0)
+ ereport(WARNING,
+ (errmsg("collation \"%s\" has version mismatch",
+ NameStr(collform->collname)),
+ errdetail("The collation in the database was created using version %s, "
+ "but the operating system provides version %s.",
+ collversionstr, actual_versionstr),
+ errhint("Rebuild all objects affected by this collation and run "
+ "ALTER COLLATION %s REFRESH VERSION, "
+ "or build PostgreSQL with the right library version.",
+ quote_qualified_identifier(get_namespace_name(collform->collnamespace),
+ NameStr(collform->collname)))));
+ }
+}
/* simple subroutine for reporting errors from newlocale() */
#ifdef HAVE_LOCALE_T
@@ -1483,21 +1586,14 @@ pg_newlocale_from_collation(Oid collid)
/* We haven't computed this yet in this session, so do it */
HeapTuple tp;
Form_pg_collation collform;
- const char *collcollate;
- const char *collctype pg_attribute_unused();
struct pg_locale_struct result;
pg_locale_t resultp;
- Datum collversion;
- bool isnull;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collid));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u", collid);
collform = (Form_pg_collation) GETSTRUCT(tp);
- collcollate = NameStr(collform->collcollate);
- collctype = NameStr(collform->collctype);
-
/* We'll fill in the result struct locally before allocating memory */
memset(&result, 0, sizeof(result));
result.provider = collform->collprovider;
@@ -1506,8 +1602,19 @@ pg_newlocale_from_collation(Oid collid)
if (collform->collprovider == COLLPROVIDER_LIBC)
{
#ifdef HAVE_LOCALE_T
+ Datum datum;
+ bool isnull;
+ const char *collcollate;
+ const char *collctype pg_attribute_unused();
locale_t loc;
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collcollate, &isnull);
+ Assert(!isnull);
+ collcollate = TextDatumGetCString(datum);
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collctype, &isnull);
+ Assert(!isnull);
+ collctype = TextDatumGetCString(datum);
+
if (strcmp(collcollate, collctype) == 0)
{
/* Normal case where they're the same */
@@ -1558,72 +1665,17 @@ pg_newlocale_from_collation(Oid collid)
}
else if (collform->collprovider == COLLPROVIDER_ICU)
{
-#ifdef USE_ICU
- UCollator *collator;
- UErrorCode status;
-
- if (strcmp(collcollate, collctype) != 0)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("collations with different collate and ctype values are not supported by ICU")));
-
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
-
- if (U_ICU_VERSION_MAJOR_NUM < 54)
- icu_set_collation_attributes(collator, collcollate);
-
- /* We will leak this string if we get an error below :-( */
- result.info.icu.locale = MemoryContextStrdup(TopMemoryContext,
- collcollate);
- result.info.icu.ucol = collator;
-#else /* not USE_ICU */
- /* could get here if a collation was created by a build with ICU */
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("ICU is not supported in this build"), \
- errhint("You need to rebuild PostgreSQL using %s.", "--with-icu")));
-#endif /* not USE_ICU */
+ Datum datum;
+ bool isnull;
+ const char *icucollstr;
+
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collicucoll, &isnull);
+ Assert(!isnull);
+ icucollstr = TextDatumGetCString(datum);
+ make_icu_collator(icucollstr, &result);
}
- collversion = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collversion,
- &isnull);
- if (!isnull)
- {
- char *actual_versionstr;
- char *collversionstr;
-
- actual_versionstr = get_collation_actual_version(collform->collprovider, collcollate);
- if (!actual_versionstr)
- {
- /*
- * This could happen when specifying a version in CREATE
- * COLLATION for a libc locale, or manually creating a mess in
- * the catalogs.
- */
- ereport(ERROR,
- (errmsg("collation \"%s\" has no actual version, but a version was specified",
- NameStr(collform->collname))));
- }
- collversionstr = TextDatumGetCString(collversion);
-
- if (strcmp(actual_versionstr, collversionstr) != 0)
- ereport(WARNING,
- (errmsg("collation \"%s\" has version mismatch",
- NameStr(collform->collname)),
- errdetail("The collation in the database was created using version %s, "
- "but the operating system provides version %s.",
- collversionstr, actual_versionstr),
- errhint("Rebuild all objects affected by this collation and run "
- "ALTER COLLATION %s REFRESH VERSION, "
- "or build PostgreSQL with the right library version.",
- quote_qualified_identifier(get_namespace_name(collform->collnamespace),
- NameStr(collform->collname)))));
- }
+ check_collation_version(tp);
ReleaseSysCache(tp);
@@ -1646,6 +1698,17 @@ get_collation_actual_version(char collprovider, const char *collcollate)
{
char *collversion = NULL;
+ if (collprovider == COLLPROVIDER_DEFAULT)
+ {
+#ifdef USE_ICU
+ if (default_locale.provider == COLLPROVIDER_ICU)
+ collversion = get_collation_actual_version(default_locale.provider,
+ default_locale.info.icu.locale);
+ else
+#endif
+ collversion = NULL;
+ }
+ else
#ifdef USE_ICU
if (collprovider == COLLPROVIDER_ICU)
{
diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c
index 8fc84649f1..7eb9e59a2c 100644
--- a/src/backend/utils/adt/varchar.c
+++ b/src/backend/utils/adt/varchar.c
@@ -750,7 +750,7 @@ bpchareq(PG_FUNCTION_ARGS)
len2 = bcTruelen(arg2);
if (lc_collate_is_c(collid) ||
- collid == DEFAULT_COLLATION_OID ||
+ (collid == DEFAULT_COLLATION_OID && default_locale.deterministic) ||
pg_newlocale_from_collation(collid)->deterministic)
{
/*
@@ -790,7 +790,7 @@ bpcharne(PG_FUNCTION_ARGS)
len2 = bcTruelen(arg2);
if (lc_collate_is_c(collid) ||
- collid == DEFAULT_COLLATION_OID ||
+ (collid == DEFAULT_COLLATION_OID && default_locale.deterministic) ||
pg_newlocale_from_collation(collid)->deterministic)
{
/*
@@ -996,8 +996,13 @@ hashbpchar(PG_FUNCTION_ARGS)
keydata = VARDATA_ANY(key);
keylen = bcTruelen(key);
- if (!lc_collate_is_c(collid) && collid != DEFAULT_COLLATION_OID)
- mylocale = pg_newlocale_from_collation(collid);
+ if (!lc_collate_is_c(collid))
+ {
+ if (collid != DEFAULT_COLLATION_OID)
+ mylocale = pg_newlocale_from_collation(collid);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
+ }
if (!mylocale || mylocale->deterministic)
{
@@ -1056,8 +1061,13 @@ hashbpcharextended(PG_FUNCTION_ARGS)
keydata = VARDATA_ANY(key);
keylen = bcTruelen(key);
- if (!lc_collate_is_c(collid) && collid != DEFAULT_COLLATION_OID)
- mylocale = pg_newlocale_from_collation(collid);
+ if (!lc_collate_is_c(collid))
+ {
+ if (collid != DEFAULT_COLLATION_OID)
+ mylocale = pg_newlocale_from_collation(collid);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
+ }
if (!mylocale || mylocale->deterministic)
{
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index bd3091bbfb..5492c85f36 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1200,8 +1200,13 @@ text_position_setup(text *t1, text *t2, Oid collid, TextPositionState *state)
check_collation_set(collid);
- if (!lc_collate_is_c(collid) && collid != DEFAULT_COLLATION_OID)
- mylocale = pg_newlocale_from_collation(collid);
+ if (!lc_collate_is_c(collid))
+ {
+ if (collid != DEFAULT_COLLATION_OID)
+ mylocale = pg_newlocale_from_collation(collid);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
+ }
if (mylocale && !mylocale->deterministic)
ereport(ERROR,
@@ -1560,6 +1565,8 @@ varstr_cmp(const char *arg1, int len1, const char *arg2, int len2, Oid collid)
if (collid != DEFAULT_COLLATION_OID)
mylocale = pg_newlocale_from_collation(collid);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
/*
* memcmp() can't tell us which of two unequal strings sorts first,
@@ -1781,7 +1788,7 @@ texteq(PG_FUNCTION_ARGS)
check_collation_set(collid);
if (lc_collate_is_c(collid) ||
- collid == DEFAULT_COLLATION_OID ||
+ (collid == DEFAULT_COLLATION_OID && default_locale.deterministic) ||
pg_newlocale_from_collation(collid)->deterministic)
{
Datum arg1 = PG_GETARG_DATUM(0);
@@ -1835,7 +1842,7 @@ textne(PG_FUNCTION_ARGS)
check_collation_set(collid);
if (lc_collate_is_c(collid) ||
- collid == DEFAULT_COLLATION_OID ||
+ (collid == DEFAULT_COLLATION_OID && default_locale.deterministic) ||
pg_newlocale_from_collation(collid)->deterministic)
{
Datum arg1 = PG_GETARG_DATUM(0);
@@ -1947,8 +1954,13 @@ text_starts_with(PG_FUNCTION_ARGS)
check_collation_set(collid);
- if (!lc_collate_is_c(collid) && collid != DEFAULT_COLLATION_OID)
- mylocale = pg_newlocale_from_collation(collid);
+ if (!lc_collate_is_c(collid))
+ {
+ if (collid != DEFAULT_COLLATION_OID)
+ mylocale = pg_newlocale_from_collation(collid);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ mylocale = &default_locale;
+ }
if (mylocale && !mylocale->deterministic)
ereport(ERROR,
@@ -2063,6 +2075,8 @@ varstr_sortsupport(SortSupport ssup, Oid typid, Oid collid)
*/
if (collid != DEFAULT_COLLATION_OID)
locale = pg_newlocale_from_collation(collid);
+ else if (default_locale.provider == COLLPROVIDER_ICU)
+ locale = &default_locale;
/*
* There is a further exception on Windows. When the database
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 7292e51f7d..2319166e91 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -30,6 +30,7 @@
#include "catalog/catalog.h"
#include "catalog/namespace.h"
#include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
#include "catalog/pg_db_role_setting.h"
#include "catalog/pg_tablespace.h"
@@ -53,6 +54,7 @@
#include "storage/sync.h"
#include "tcop/tcopprot.h"
#include "utils/acl.h"
+#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc.h"
#include "utils/memutils.h"
@@ -306,6 +308,8 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
{
HeapTuple tup;
Form_pg_database dbform;
+ Datum datum;
+ bool isnull;
char *collate;
char *ctype;
@@ -389,8 +393,12 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
PGC_BACKEND, PGC_S_DYNAMIC_DEFAULT);
/* assign locale variables */
- collate = NameStr(dbform->datcollate);
- ctype = NameStr(dbform->datctype);
+ datum = SysCacheGetAttr(DATABASEOID, tup, Anum_pg_database_datcollate, &isnull);
+ Assert(!isnull);
+ collate = TextDatumGetCString(datum);
+ datum = SysCacheGetAttr(DATABASEOID, tup, Anum_pg_database_datctype, &isnull);
+ Assert(!isnull);
+ ctype = TextDatumGetCString(datum);
if (pg_perm_setlocale(LC_COLLATE, collate) == NULL)
ereport(FATAL,
@@ -406,6 +414,31 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
+ if (dbform->datcollprovider == COLLPROVIDER_ICU)
+ {
+ datum = SysCacheGetAttr(DATABASEOID, tup, Anum_pg_database_daticucoll, &isnull);
+ Assert(!isnull);
+ make_icu_collator(TextDatumGetCString(datum), &default_locale);
+ }
+
+ default_locale.provider = dbform->datcollprovider;
+ /*
+ * Default locale is currently always deterministic. Nondeterministic
+ * locales currently don't support pattern matching, which would break a
+ * lot of things if applied globally.
+ */
+ default_locale.deterministic = true;
+
+ {
+ HeapTuple tp;
+
+ tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(DEFAULT_COLLATION_OID));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for collation %u", DEFAULT_COLLATION_OID);
+ check_collation_version(tp);
+ ReleaseSysCache(tp);
+ }
+
/* Make the locale settings visible as GUC variables, too */
SetConfigOption("lc_collate", collate, PGC_INTERNAL, PGC_S_OVERRIDE);
SetConfigOption("lc_ctype", ctype, PGC_INTERNAL, PGC_S_OVERRIDE);
diff --git a/src/bin/initdb/Makefile b/src/bin/initdb/Makefile
index a620a5bea0..993d2fa7a3 100644
--- a/src/bin/initdb/Makefile
+++ b/src/bin/initdb/Makefile
@@ -62,6 +62,8 @@ clean distclean maintainer-clean:
# ensure that changes in datadir propagate into object file
initdb.o: initdb.c $(top_builddir)/src/Makefile.global
+export with_icu
+
check:
$(prove_check)
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 03b80f9575..9ee0037d50 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -131,6 +131,8 @@ static char *lc_monetary = NULL;
static char *lc_numeric = NULL;
static char *lc_time = NULL;
static char *lc_messages = NULL;
+static char collation_provider[] = {COLLPROVIDER_LIBC, '\0'};
+static char *icu_locale = NULL;
static const char *default_text_search_config = NULL;
static char *username = NULL;
static bool pwprompt = false;
@@ -1404,6 +1406,12 @@ bootstrap_template1(void)
bki_lines = replace_token(bki_lines, "LC_CTYPE",
escape_quotes_bki(lc_ctype));
+ bki_lines = replace_token(bki_lines, "ICUCOLL",
+ escape_quotes_bki(collation_provider[0] == COLLPROVIDER_ICU ? icu_locale : "_null_"));
+
+ bki_lines = replace_token(bki_lines, "COLLPROVIDER",
+ collation_provider);
+
/* Also ensure backend isn't confused by this environment var: */
unsetenv("PGCLIENTENCODING");
@@ -1586,6 +1594,12 @@ setup_description(FILE *cmdfd)
static void
setup_collation(FILE *cmdfd)
{
+ /*
+ * Set version of the default collation.
+ */
+ PG_CMD_PRINTF("UPDATE pg_collation SET collversion = pg_collation_actual_version(oid) WHERE oid = %d;\n\n",
+ DEFAULT_COLLATION_OID);
+
/*
* Add an SQL-standard name. We don't want to pin this, so it doesn't go
* in pg_collation.h. But add it before reading system collations, so
@@ -1839,8 +1853,6 @@ make_template0(FILE *cmdfd)
{
const char *const *line;
static const char *const template0_setup[] = {
- "CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false;\n\n",
-
/*
* We use the OID of template0 to determine datlastsysoid
*/
@@ -1865,6 +1877,9 @@ make_template0(FILE *cmdfd)
NULL
};
+ PG_CMD_PRINTF("CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false COLLATION_PROVIDER = %s;\n\n",
+ collation_provider[0] == COLLPROVIDER_ICU ? "icu" : "libc");
+
for (line = template0_setup; *line; line++)
PG_CMD_PUTS(*line);
}
@@ -2136,13 +2151,14 @@ setlocales(void)
lc_monetary = locale;
if (!lc_messages)
lc_messages = locale;
+ if (!icu_locale)
+ icu_locale = locale;
}
/*
* canonicalize locale names, and obtain any missing values from our
* current environment
*/
-
check_locale_name(LC_CTYPE, lc_ctype, &canonname);
lc_ctype = canonname;
check_locale_name(LC_COLLATE, lc_collate, &canonname);
@@ -2161,6 +2177,18 @@ setlocales(void)
check_locale_name(LC_CTYPE, lc_messages, &canonname);
lc_messages = canonname;
#endif
+
+ /*
+ * If ICU is selected but no ICU locale has been given, take the
+ * lc_collate locale and chop off any encoding suffix. This should give
+ * the user a configuration that resembles their operating system's locale
+ * setup.
+ */
+ if (collation_provider[0] == COLLPROVIDER_ICU && !icu_locale)
+ {
+ icu_locale = pg_strdup(lc_collate);
+ icu_locale[strcspn(icu_locale, ".")] = '\0';
+ }
}
/*
@@ -2176,9 +2204,12 @@ usage(const char *progname)
printf(_(" -A, --auth=METHOD default authentication method for local connections\n"));
printf(_(" --auth-host=METHOD default authentication method for local TCP/IP connections\n"));
printf(_(" --auth-local=METHOD default authentication method for local-socket connections\n"));
+ printf(_(" --collation-provider={libc|icu}\n"
+ " set default collation provider for new databases\n"));
printf(_(" [-D, --pgdata=]DATADIR location for this database cluster\n"));
printf(_(" -E, --encoding=ENCODING set default encoding for new databases\n"));
printf(_(" -g, --allow-group-access allow group read/execute on data directory\n"));
+ printf(_(" --icu-locale=LOCALE set ICU locale for new databases\n"));
printf(_(" -k, --data-checksums use data page checksums\n"));
printf(_(" --locale=LOCALE set default locale for new databases\n"));
printf(_(" --lc-collate=, --lc-ctype=, --lc-messages=LOCALE\n"
@@ -2353,7 +2384,8 @@ setup_locale_encoding(void)
strcmp(lc_ctype, lc_time) == 0 &&
strcmp(lc_ctype, lc_numeric) == 0 &&
strcmp(lc_ctype, lc_monetary) == 0 &&
- strcmp(lc_ctype, lc_messages) == 0)
+ strcmp(lc_ctype, lc_messages) == 0 &&
+ (!icu_locale || strcmp(lc_ctype, icu_locale) == 0))
printf(_("The database cluster will be initialized with locale \"%s\".\n"), lc_ctype);
else
{
@@ -2370,9 +2402,13 @@ setup_locale_encoding(void)
lc_monetary,
lc_numeric,
lc_time);
+ if (icu_locale)
+ printf(_(" ICU: %s\n"), icu_locale);
}
- if (!encoding)
+ if (!encoding && collation_provider[0] == COLLPROVIDER_ICU)
+ encodingid = PG_UTF8;
+ else if (!encoding)
{
int ctype_enc;
@@ -2876,6 +2912,8 @@ main(int argc, char *argv[])
{"data-checksums", no_argument, NULL, 'k'},
{"allow-group-access", no_argument, NULL, 'g'},
{"discard-caches", no_argument, NULL, 14},
+ {"collation-provider", required_argument, NULL, 15},
+ {"icu-locale", required_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -3022,6 +3060,20 @@ main(int argc, char *argv[])
extra_options,
"-c debug_discard_caches=1");
break;
+ case 15:
+ if (strcmp(optarg, "icu") == 0)
+ collation_provider[0] = COLLPROVIDER_ICU;
+ else if (strcmp(optarg, "libc") == 0)
+ collation_provider[0] = COLLPROVIDER_LIBC;
+ else
+ {
+ pg_log_error("unrecognized collation provider: %s", optarg);
+ exit(1);
+ }
+ break;
+ case 16:
+ icu_locale = pg_strdup(optarg);
+ break;
default:
/* getopt_long already emitted a complaint */
fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 6796d8520e..6b3208a03d 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -11,7 +11,7 @@
use File::stat qw{lstat};
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
-use Test::More tests => 22;
+use Test::More tests => 24;
my $tempdir = PostgreSQL::Test::Utils::tempdir;
my $xlogdir = "$tempdir/pgxlog";
@@ -92,3 +92,19 @@
ok(check_mode_recursive($datadir_group, 0750, 0640),
'check PGDATA permissions');
}
+
+# Collation provider tests
+
+if ($ENV{with_icu} eq 'yes')
+{
+ command_ok(['initdb', '--no-sync', '--collation-provider=icu', "$tempdir/data2"],
+ 'collation provider ICU');
+}
+else
+{
+ command_fails(['initdb', '--no-sync', '--collation-provider=icu', "$tempdir/data2"],
+ 'collation provider ICU fails since no ICU support');
+}
+
+command_fails(['initdb', '--no-sync', '--collation-provider=xyz', "$tempdir/dataX"],
+ 'fails for invalid collation provider');
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index b52f3ccda2..7c6af8d2ef 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -2740,6 +2740,7 @@ dumpDatabase(Archive *fout)
i_datname,
i_dba,
i_encoding,
+ i_datcollprovider,
i_collate,
i_ctype,
i_frozenxid,
@@ -2755,6 +2756,7 @@ dumpDatabase(Archive *fout)
const char *datname,
*dba,
*encoding,
+ *datcollprovider,
*collate,
*ctype,
*datistemplate,
@@ -2774,6 +2776,7 @@ dumpDatabase(Archive *fout)
appendPQExpBuffer(dbQry, "SELECT tableoid, oid, datname, "
"(%s datdba) AS dba, "
"pg_encoding_to_char(encoding) AS encoding, "
+ "datcollprovider, "
"datcollate, datctype, datfrozenxid, datminmxid, "
"datacl, acldefault('d', datdba) AS acldefault, "
"datistemplate, datconnlimit, "
@@ -2807,6 +2810,7 @@ dumpDatabase(Archive *fout)
i_datname = PQfnumber(res, "datname");
i_dba = PQfnumber(res, "dba");
i_encoding = PQfnumber(res, "encoding");
+ i_datcollprovider = PQfnumber(res, "datcollprovider");
i_collate = PQfnumber(res, "datcollate");
i_ctype = PQfnumber(res, "datctype");
i_frozenxid = PQfnumber(res, "datfrozenxid");
@@ -2822,6 +2826,7 @@ dumpDatabase(Archive *fout)
datname = PQgetvalue(res, 0, i_datname);
dba = PQgetvalue(res, 0, i_dba);
encoding = PQgetvalue(res, 0, i_encoding);
+ datcollprovider = PQgetvalue(res, 0, i_datcollprovider);
collate = PQgetvalue(res, 0, i_collate);
ctype = PQgetvalue(res, 0, i_ctype);
frozenxid = atooid(PQgetvalue(res, 0, i_frozenxid));
@@ -2847,6 +2852,17 @@ dumpDatabase(Archive *fout)
appendPQExpBufferStr(creaQry, " ENCODING = ");
appendStringLiteralAH(creaQry, encoding, fout);
}
+ if (strlen(datcollprovider) > 0)
+ {
+ appendPQExpBufferStr(creaQry, " COLLATION_PROVIDER = ");
+ if (datcollprovider[0] == 'c')
+ appendPQExpBufferStr(creaQry, "libc");
+ else if (datcollprovider[0] == 'i')
+ appendPQExpBufferStr(creaQry, "icu");
+ else
+ fatal("unrecognized collation provider: %s",
+ datcollprovider);
+ }
if (strlen(collate) > 0 && strcmp(collate, ctype) == 0)
{
appendPQExpBufferStr(creaQry, " LOCALE = ");
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index c28788e84f..0d710bd47e 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -896,6 +896,18 @@ listAllDbs(const char *pattern, bool verbose)
gettext_noop("Encoding"),
gettext_noop("Collate"),
gettext_noop("Ctype"));
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ " d.daticucoll as \"%s\",\n"
+ " CASE d.datcollprovider WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\",\n",
+ gettext_noop("ICU Collation"),
+ gettext_noop("Coll. Provider"));
+ else
+ appendPQExpBuffer(&buf,
+ " d.datcollate as \"%s\",\n"
+ " 'libc' AS \"%s\",\n",
+ gettext_noop("ICU Collation"),
+ gettext_noop("Coll. Provider"));
appendPQExpBufferStr(&buf, " ");
printACLColumn(&buf, "d.datacl");
if (verbose)
@@ -4573,7 +4585,7 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
PQExpBufferData buf;
PGresult *res;
printQueryOpt myopt = pset.popt;
- static const bool translate_columns[] = {false, false, false, false, false, true, false};
+ static const bool translate_columns[] = {false, false, false, false, false, false, true, false};
initPQExpBuffer(&buf);
@@ -4587,6 +4599,15 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
gettext_noop("Collate"),
gettext_noop("Ctype"));
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ",\n c.collicucoll AS \"%s\"",
+ gettext_noop("ICU Collation"));
+ else
+ appendPQExpBuffer(&buf,
+ ",\n c.collcollate AS \"%s\"",
+ gettext_noop("ICU Collation"));
+
if (pset.sversion >= 100000)
appendPQExpBuffer(&buf,
",\n CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index cf30239f6d..7db4a68df6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2587,7 +2587,7 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("OWNER", "TEMPLATE", "ENCODING", "TABLESPACE",
"IS_TEMPLATE",
"ALLOW_CONNECTIONS", "CONNECTION LIMIT",
- "LC_COLLATE", "LC_CTYPE", "LOCALE");
+ "LC_COLLATE", "LC_CTYPE", "LOCALE", "COLLATION_PROVIDER");
else if (Matches("CREATE", "DATABASE", MatchAny, "TEMPLATE"))
COMPLETE_WITH_QUERY(Query_for_list_of_template_databases);
diff --git a/src/bin/scripts/Makefile b/src/bin/scripts/Makefile
index b8d7cf2f2d..342a57d71b 100644
--- a/src/bin/scripts/Makefile
+++ b/src/bin/scripts/Makefile
@@ -53,6 +53,8 @@ clean distclean maintainer-clean:
rm -f common.o $(WIN32RES)
rm -rf tmp_check
+export with_icu
+
check:
$(prove_check)
diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index 041454f075..1944580f36 100644
--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -38,6 +38,7 @@ main(int argc, char *argv[])
{"lc-ctype", required_argument, NULL, 2},
{"locale", required_argument, NULL, 'l'},
{"maintenance-db", required_argument, NULL, 3},
+ {"collation-provider", required_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
@@ -61,6 +62,7 @@ main(int argc, char *argv[])
char *lc_collate = NULL;
char *lc_ctype = NULL;
char *locale = NULL;
+ char *collation_provider = NULL;
PQExpBufferData sql;
@@ -119,6 +121,9 @@ main(int argc, char *argv[])
case 3:
maintenance_db = pg_strdup(optarg);
break;
+ case 4:
+ collation_provider = pg_strdup(optarg);
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
@@ -217,6 +222,8 @@ main(int argc, char *argv[])
appendPQExpBufferStr(&sql, " LC_CTYPE ");
appendStringLiteralConn(&sql, lc_ctype, conn);
}
+ if (collation_provider)
+ appendPQExpBuffer(&sql, " COLLATION_PROVIDER %s", collation_provider);
appendPQExpBufferChar(&sql, ';');
@@ -267,6 +274,8 @@ help(const char *progname)
printf(_("Usage:\n"));
printf(_(" %s [OPTION]... [DBNAME] [DESCRIPTION]\n"), progname);
printf(_("\nOptions:\n"));
+ printf(_(" --collation-provider={libc|icu}\n"
+ " collation provider for the database's default collation\n"));
printf(_(" -D, --tablespace=TABLESPACE default tablespace for the database\n"));
printf(_(" -e, --echo show the commands being sent to the server\n"));
printf(_(" -E, --encoding=ENCODING encoding for the database\n"));
diff --git a/src/bin/scripts/t/020_createdb.pl b/src/bin/scripts/t/020_createdb.pl
index 6bcc59de08..e1a4af384c 100644
--- a/src/bin/scripts/t/020_createdb.pl
+++ b/src/bin/scripts/t/020_createdb.pl
@@ -6,7 +6,7 @@
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
-use Test::More tests => 25;
+use Test::More tests => 28;
program_help_ok('createdb');
program_version_ok('createdb');
@@ -25,9 +25,27 @@
qr/statement: CREATE DATABASE foobar2 ENCODING 'LATIN1'/,
'create database with encoding');
+if ($ENV{with_icu} eq 'yes')
+{
+ $node->issues_sql_like(
+ [ 'createdb', '-T', 'template0', '--collation-provider=icu', 'foobar4' ],
+ qr/statement: CREATE DATABASE foobar4 .* COLLATION_PROVIDER icu/,
+ 'create database with ICU');
+}
+else
+{
+ $node->command_fails(
+ [ 'createdb', '-T', 'template0', '--collation-provider=icu', 'foobar4' ],
+ 'create database with ICU fails since no ICU support');
+ pass;
+}
+
$node->command_fails([ 'createdb', 'foobar1' ],
'fails if database already exists');
+$node->command_fails([ 'createdb', '-T', 'template0', '--collation-provider=xyz', 'foobarX' ],
+ 'fails for invalid collation provider');
+
# Check use of templates with shared dependencies copied from the template.
my ($ret, $stdout, $stderr) = $node->psql(
'foobar2',
diff --git a/src/include/catalog/pg_collation.dat b/src/include/catalog/pg_collation.dat
index 6e0ab1ab4b..2a8cf2be7c 100644
--- a/src/include/catalog/pg_collation.dat
+++ b/src/include/catalog/pg_collation.dat
@@ -14,8 +14,7 @@
{ oid => '100', oid_symbol => 'DEFAULT_COLLATION_OID',
descr => 'database\'s default collation',
- collname => 'default', collprovider => 'd', collencoding => '-1',
- collcollate => '', collctype => '' },
+ collname => 'default', collprovider => 'd', collencoding => '-1' },
{ oid => '950', oid_symbol => 'C_COLLATION_OID',
descr => 'standard C collation',
collname => 'C', collprovider => 'c', collencoding => '-1',
diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h
index 03bd4cb5d4..bff59abe92 100644
--- a/src/include/catalog/pg_collation.h
+++ b/src/include/catalog/pg_collation.h
@@ -39,9 +39,10 @@ CATALOG(pg_collation,3456,CollationRelationId)
char collprovider; /* see constants below */
bool collisdeterministic BKI_DEFAULT(t);
int32 collencoding; /* encoding for this collation; -1 = "all" */
- NameData collcollate; /* LC_COLLATE setting */
- NameData collctype; /* LC_CTYPE setting */
#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ text collcollate BKI_DEFAULT(_null_); /* LC_COLLATE setting */
+ text collctype BKI_DEFAULT(_null_); /* LC_CTYPE setting */
+ text collicucoll BKI_DEFAULT(_null_); /* ICU collation string */
text collversion BKI_DEFAULT(_null_); /* provider-dependent
* version of collation
* data */
@@ -75,6 +76,7 @@ extern Oid CollationCreate(const char *collname, Oid collnamespace,
bool collisdeterministic,
int32 collencoding,
const char *collcollate, const char *collctype,
+ const char *collicucoll,
const char *collversion,
bool if_not_exists,
bool quiet);
diff --git a/src/include/catalog/pg_database.dat b/src/include/catalog/pg_database.dat
index b8aa1364a0..6c62efcc54 100644
--- a/src/include/catalog/pg_database.dat
+++ b/src/include/catalog/pg_database.dat
@@ -15,7 +15,7 @@
{ oid => '1', oid_symbol => 'TemplateDbOid',
descr => 'default template for new databases',
datname => 'template1', encoding => 'ENCODING', datcollate => 'LC_COLLATE',
- datctype => 'LC_CTYPE', datistemplate => 't', datallowconn => 't',
+ datctype => 'LC_CTYPE', daticucoll => 'ICUCOLL', datcollprovider => 'COLLPROVIDER', datistemplate => 't', datallowconn => 't',
datconnlimit => '-1', datlastsysoid => '0', datfrozenxid => '0',
datminmxid => '1', dattablespace => 'pg_default', datacl => '_null_' },
diff --git a/src/include/catalog/pg_database.h b/src/include/catalog/pg_database.h
index 43f3beb6a3..9280850185 100644
--- a/src/include/catalog/pg_database.h
+++ b/src/include/catalog/pg_database.h
@@ -40,11 +40,8 @@ CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID
/* character encoding */
int32 encoding;
- /* LC_COLLATE setting */
- NameData datcollate;
-
- /* LC_CTYPE setting */
- NameData datctype;
+ /* see pg_collation.collprovider */
+ char datcollprovider;
/* allowed as CREATE DATABASE template? */
bool datistemplate;
@@ -68,6 +65,15 @@ CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID
Oid dattablespace BKI_LOOKUP(pg_tablespace);
#ifdef CATALOG_VARLEN /* variable-length fields start here */
+ /* LC_COLLATE setting */
+ text datcollate BKI_FORCE_NOT_NULL;
+
+ /* LC_CTYPE setting */
+ text datctype BKI_FORCE_NOT_NULL;
+
+ /* ICU collation */
+ text daticucoll;
+
/* access permissions */
aclitem datacl[1];
#endif
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 2946f46c76..19478e573f 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -101,6 +101,12 @@ struct pg_locale_struct
typedef struct pg_locale_struct *pg_locale_t;
+extern struct pg_locale_struct default_locale;
+
+extern void make_icu_collator(const char *icucollstr,
+ struct pg_locale_struct *resultp);
+extern void check_collation_version(HeapTuple colltuple);
+
extern pg_locale_t pg_newlocale_from_collation(Oid collid);
extern char *get_collation_actual_version(char collprovider, const char *collcollate);
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 70133df804..3d9647b597 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1029,14 +1029,12 @@ CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
ERROR: collation "test0" already exists
do $$
BEGIN
- EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
- ', lc_ctype = ' ||
- quote_literal(current_setting('lc_ctype')) || ');';
+ EXECUTE 'CREATE COLLATION test1 (provider = icu, locale = ' ||
+ quote_literal(current_setting('lc_collate')) || ');';
END
$$;
-CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
-ERROR: parameter "lc_ctype" must be specified
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, needs "locale"
+ERROR: parameter "locale" must be specified
CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */ DROP COLLATION testx;
CREATE COLLATION test4 FROM nonsense;
ERROR: collation "nonsense" for encoding "UTF8" does not exist
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 9cee3d0042..0677ba56e4 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -366,13 +366,11 @@ CREATE SCHEMA test_schema;
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
do $$
BEGIN
- EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
- ', lc_ctype = ' ||
- quote_literal(current_setting('lc_ctype')) || ');';
+ EXECUTE 'CREATE COLLATION test1 (provider = icu, locale = ' ||
+ quote_literal(current_setting('lc_collate')) || ');';
END
$$;
-CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, needs "locale"
CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */ DROP COLLATION testx;
CREATE COLLATION test4 FROM nonsense;
base-commit: 8112bcf0cc602e00e95eab6c4bdc0eb73b5b547d
--
2.34.1