On Sat, Sep 25, 2021 at 4:11 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > Etsuro Fujita <etsuro.fuj...@gmail.com> writes: > > One thing I noticed is that collatable operators/functions sent to the > > remote might also cause an unexpected result when the default > > collations are not compatible. Consider this example (even with your > > patch): > > ... > > where ft1 is a foreign table with an integer column c1. As shown > > above, the sort using the collatable function chr() is performed > > remotely, so the select query might produce the result in an > > unexpected sort order when the default collations are not compatible. > > I don't think there's anything really new there --- it's still assuming > that COLLATE "default" means the same locally and remotely.
I thought that the example showed that we would need to specify a collation per-operation, not only per-foreign-table-column, like “ORDER BY chr(c1) COLLATE “foo”” where “foo” is the actual name of a local collation matching the local server’s default collation, when the default collation doesn’t match the remote server’s default collation, to avoid pushing down operations incorrectly as in the example. > As a short-term answer, I propose that we apply (and back-patch) the > attached documentation changes. The attached patch looks good to me. > Longer-term, it seems like we really have to be able to represent > the notion of a remote column that has an "unknown" collation (that > is, one that doesn't match any local collation, or at least is not > known to do so). +1 > A rough sketch for making this happen is: > > 1. Create a built-in "unknown" entry in pg_collation. Insert some > hack or other to prevent this from being applied to any real, local > column; but allow foreign-table columns to have it. > > 2. Apply mods, probably fairly similar to my patch, that prevent > postgres_fdw from believing that "unknown" matches any local > collation. (Hm, actually maybe no special code change will be > needed here, once "unknown" has its own OID?) > > 3. Change postgresImportForeignSchema so that it can substitute > the "unknown" collation at need. The exact rules for this could > be debated depending on whether you'd rather prioritize safety or > ease-of-use, but I think at least we should use "unknown" whenever > import_collate is turned off. Perhaps there should be an option > to substitute it for remote "default" as well. (Further down the > road, perhaps that could be generalized to allow a user-controlled > mapping from remote to local collations.) In addition, a) we should detect whether local “default” matches remote “default”, and b) if not, we should prevent pushing down sort/comparison operations using collatable functions/operators like “ORDER BY chr(c1)” in the example (and pushing down those operations on foreign-table columns labeled with “COLLATE default” if such labeling is allowed)? Best regards, Etsuro Fujita