Alvaro Herrera <alvhe...@2ndquadrant.com> writes: > I notice that some information_schema view columns end up with C > collation after this patch, and others remain with default collation. > Is that sensible? (I think the only two cases where this might matter > at all are information_schema.parameters.parameter_name, > information_schema.routines.external_name and > information_schema.foreign_servers.foreign_server_type.)
Yeah. Looking closer at that, there are no collation-sensitive indexes in information_schema (if there were, the existing opr_sanity test would have caught 'em). But there are collation-sensitive table columns, which do have pg_statistic entries, and those entries are at least nominally broken by copying them into a database with a different default collation. We could think of two ways to deal with that. One is to plaster COLLATE "C" on each textual table column in the information_schema. A more aggressive approach is to attach COLLATE "C" to each of the domain types that information_schema defines, which fixes the table columns a fortiori, and also causes all of the exposed information_schema view columns to acquire database-independent collations. I tried both ways, as in the attached patches below (each meant to be applied on top of my patch upthread), and they both pass check-world. A possible advantage of the second approach is that it could end up allowing comparisons on information_schema view columns to be translated to indexable comparisons on the underlying "name" columns, which would be a pleasant outcome. On the other hand, people might be annoyed by the semantics change, if they'd previously been doing that with the expectation of getting database-collation-based comparisons. I'm not sure whether the SQL standard says anything that either patch would be violating. I see that it does say that these domains have CHARACTER SET SQL_TEXT, and that the collation of that character set is implementation-defined, so I think we could get away with changing so far as spec compliance is concerned. regards, tom lane
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql index 6227a8f3..742e2b6 100644 --- a/src/backend/catalog/information_schema.sql +++ b/src/backend/catalog/information_schema.sql @@ -1576,13 +1576,13 @@ GRANT SELECT ON sequences TO PUBLIC; */ CREATE TABLE sql_features ( - feature_id character_data, - feature_name character_data, - sub_feature_id character_data, - sub_feature_name character_data, - is_supported yes_or_no, - is_verified_by character_data, - comments character_data + feature_id character_data COLLATE "C", + feature_name character_data COLLATE "C", + sub_feature_id character_data COLLATE "C", + sub_feature_name character_data COLLATE "C", + is_supported yes_or_no COLLATE "C", + is_verified_by character_data COLLATE "C", + comments character_data COLLATE "C" ); -- Will be filled with external data by initdb. @@ -1599,11 +1599,11 @@ GRANT SELECT ON sql_features TO PUBLIC; -- clause 9.1. CREATE TABLE sql_implementation_info ( - implementation_info_id character_data, - implementation_info_name character_data, + implementation_info_id character_data COLLATE "C", + implementation_info_name character_data COLLATE "C", integer_value cardinal_number, - character_value character_data, - comments character_data + character_value character_data COLLATE "C", + comments character_data COLLATE "C" ); INSERT INTO sql_implementation_info VALUES ('10003', 'CATALOG NAME', NULL, 'Y', NULL); @@ -1628,13 +1628,13 @@ GRANT SELECT ON sql_implementation_info TO PUBLIC; */ CREATE TABLE sql_languages ( - sql_language_source character_data, - sql_language_year character_data, - sql_language_conformance character_data, - sql_language_integrity character_data, - sql_language_implementation character_data, - sql_language_binding_style character_data, - sql_language_programming_language character_data + sql_language_source character_data COLLATE "C", + sql_language_year character_data COLLATE "C", + sql_language_conformance character_data COLLATE "C", + sql_language_integrity character_data COLLATE "C", + sql_language_implementation character_data COLLATE "C", + sql_language_binding_style character_data COLLATE "C", + sql_language_programming_language character_data COLLATE "C" ); INSERT INTO sql_languages VALUES ('ISO 9075', '1999', 'CORE', NULL, NULL, 'DIRECT', NULL); @@ -1651,11 +1651,11 @@ GRANT SELECT ON sql_languages TO PUBLIC; */ CREATE TABLE sql_packages ( - feature_id character_data, - feature_name character_data, - is_supported yes_or_no, - is_verified_by character_data, - comments character_data + feature_id character_data COLLATE "C", + feature_name character_data COLLATE "C", + is_supported yes_or_no COLLATE "C", + is_verified_by character_data COLLATE "C", + comments character_data COLLATE "C" ); INSERT INTO sql_packages VALUES ('PKG000', 'Core', 'NO', NULL, ''); @@ -1678,11 +1678,11 @@ GRANT SELECT ON sql_packages TO PUBLIC; */ CREATE TABLE sql_parts ( - feature_id character_data, - feature_name character_data, - is_supported yes_or_no, - is_verified_by character_data, - comments character_data + feature_id character_data COLLATE "C", + feature_name character_data COLLATE "C", + is_supported yes_or_no COLLATE "C", + is_verified_by character_data COLLATE "C", + comments character_data COLLATE "C" ); INSERT INTO sql_parts VALUES ('1', 'Framework (SQL/Framework)', 'NO', NULL, ''); @@ -1705,9 +1705,9 @@ INSERT INTO sql_parts VALUES ('14', 'XML-Related Specifications (SQL/XML)', 'YES CREATE TABLE sql_sizing ( sizing_id cardinal_number, - sizing_name character_data, + sizing_name character_data COLLATE "C", supported_value cardinal_number, - comments character_data + comments character_data COLLATE "C" ); INSERT INTO sql_sizing VALUES (34, 'MAXIMUM CATALOG NAME LENGTH', 63, NULL); @@ -1753,10 +1753,10 @@ GRANT SELECT ON sql_sizing TO PUBLIC; CREATE TABLE sql_sizing_profiles ( sizing_id cardinal_number, - sizing_name character_data, - profile_id character_data, + sizing_name character_data COLLATE "C", + profile_id character_data COLLATE "C", required_value cardinal_number, - comments character_data + comments character_data COLLATE "C" ); GRANT SELECT ON sql_sizing_profiles TO PUBLIC; diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out index 52cd7b9..6dca1b7 100644 --- a/src/test/regress/expected/opr_sanity.out +++ b/src/test/regress/expected/opr_sanity.out @@ -2060,7 +2060,18 @@ ORDER BY 1; -- a representational error in pg_index, but simply wrong catalog design. -- It's bad because we expect to be able to clone template0 and assign the -- copy a different database collation. It would especially not work for --- shared catalogs. Collation-sensitive indexes should have "C" collation. +-- shared catalogs. +SELECT relname, attname, attcollation +FROM pg_class c, pg_attribute a +WHERE c.oid = attrelid AND c.oid < 16384 AND + c.relkind != 'v' AND -- we don't care about columns in views + attcollation != 0 AND + attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C'); + relname | attname | attcollation +---------+---------+-------------- +(0 rows) + +-- Double-check that collation-sensitive indexes have "C" collation, too. SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll FROM (SELECT indexrelid, indrelid, unnest(indclass) as iclass, unnest(indcollation) as icoll diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql index 5cf4df1..64eca7e 100644 --- a/src/test/regress/sql/opr_sanity.sql +++ b/src/test/regress/sql/opr_sanity.sql @@ -1333,7 +1333,16 @@ ORDER BY 1; -- a representational error in pg_index, but simply wrong catalog design. -- It's bad because we expect to be able to clone template0 and assign the -- copy a different database collation. It would especially not work for --- shared catalogs. Collation-sensitive indexes should have "C" collation. +-- shared catalogs. + +SELECT relname, attname, attcollation +FROM pg_class c, pg_attribute a +WHERE c.oid = attrelid AND c.oid < 16384 AND + c.relkind != 'v' AND -- we don't care about columns in views + attcollation != 0 AND + attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C'); + +-- Double-check that collation-sensitive indexes have "C" collation, too. SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll FROM (SELECT indexrelid, indrelid,
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql index 6227a8f3..0fbcfa8 100644 --- a/src/backend/catalog/information_schema.sql +++ b/src/backend/catalog/information_schema.sql @@ -208,7 +208,7 @@ CREATE DOMAIN cardinal_number AS integer * CHARACTER_DATA domain */ -CREATE DOMAIN character_data AS character varying; +CREATE DOMAIN character_data AS character varying COLLATE "C"; /* @@ -216,7 +216,7 @@ CREATE DOMAIN character_data AS character varying; * SQL_IDENTIFIER domain */ -CREATE DOMAIN sql_identifier AS character varying; +CREATE DOMAIN sql_identifier AS character varying COLLATE "C"; /* @@ -243,7 +243,7 @@ CREATE DOMAIN time_stamp AS timestamp(2) with time zone * YES_OR_NO domain */ -CREATE DOMAIN yes_or_no AS character varying(3) +CREATE DOMAIN yes_or_no AS character varying(3) COLLATE "C" CONSTRAINT yes_or_no_check CHECK (value IN ('YES', 'NO')); diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out index 52cd7b9..6dca1b7 100644 --- a/src/test/regress/expected/opr_sanity.out +++ b/src/test/regress/expected/opr_sanity.out @@ -2060,7 +2060,18 @@ ORDER BY 1; -- a representational error in pg_index, but simply wrong catalog design. -- It's bad because we expect to be able to clone template0 and assign the -- copy a different database collation. It would especially not work for --- shared catalogs. Collation-sensitive indexes should have "C" collation. +-- shared catalogs. +SELECT relname, attname, attcollation +FROM pg_class c, pg_attribute a +WHERE c.oid = attrelid AND c.oid < 16384 AND + c.relkind != 'v' AND -- we don't care about columns in views + attcollation != 0 AND + attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C'); + relname | attname | attcollation +---------+---------+-------------- +(0 rows) + +-- Double-check that collation-sensitive indexes have "C" collation, too. SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll FROM (SELECT indexrelid, indrelid, unnest(indclass) as iclass, unnest(indcollation) as icoll diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql index 5cf4df1..64eca7e 100644 --- a/src/test/regress/sql/opr_sanity.sql +++ b/src/test/regress/sql/opr_sanity.sql @@ -1333,7 +1333,16 @@ ORDER BY 1; -- a representational error in pg_index, but simply wrong catalog design. -- It's bad because we expect to be able to clone template0 and assign the -- copy a different database collation. It would especially not work for --- shared catalogs. Collation-sensitive indexes should have "C" collation. +-- shared catalogs. + +SELECT relname, attname, attcollation +FROM pg_class c, pg_attribute a +WHERE c.oid = attrelid AND c.oid < 16384 AND + c.relkind != 'v' AND -- we don't care about columns in views + attcollation != 0 AND + attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C'); + +-- Double-check that collation-sensitive indexes have "C" collation, too. SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll FROM (SELECT indexrelid, indrelid,