This thread duplicates https://postgr.es/m/flat/16539.1431472961%40sss.pgh.pa.us
On Sun, Dec 30, 2018 at 01:24:02PM -0500, Tom Lane wrote: > So this coding amounts to an undocumented > assumption that every non-cross-type btree comparison function is > leakproof. > select p.oid::regprocedure from pg_proc p, pg_amproc a, pg_opfamily f > where p.oid=a.amproc and f.oid=a.amprocfamily and opfmethod=403 and not > proleakproof and amproclefttype=amprocrighttype and amprocnum=1; > > bpcharcmp(character,character) > btarraycmp(anyarray,anyarray) > btbpchar_pattern_cmp(character,character) > btoidvectorcmp(oidvector,oidvector) > btrecordcmp(record,record) > btrecordimagecmp(record,record) > bttext_pattern_cmp(text,text) > bttextcmp(text,text) > enum_cmp(anyenum,anyenum) > jsonb_cmp(jsonb,jsonb) > numeric_cmp(numeric,numeric) > pg_lsn_cmp(pg_lsn,pg_lsn) > range_cmp(anyrange,anyrange) > tsquery_cmp(tsquery,tsquery) > tsvector_cmp(tsvector,tsvector) > > so this assumption is, on its face, wrong. > > In practice it might be all right, because it's hard to see a reason why > a btree comparison function would ever throw an error except for internal > failures, which are probably outside the scope of leakproofness guarantees > anyway. Nonetheless, if we didn't mark these functions as leakproof, > why not? pg_lsn_cmp() and btoidvectorcmp() surely could advertise leakproofness. I'm not sure about enum_cmp(), numeric_cmp(), tsquery_cmp() or tsvector_cmp(). I can't think of a reason those would leak, though. btrecordcmp() and other polymorphic cmp functions can fail: create type boxrec as (a box); select '("(1,1),(0,0)")'::boxrec = '("(1,1),(0,0)")'::boxrec; => ERROR: could not identify an equality operator for type box The documentation says, "a function which throws an error message for some argument values but not others ... is not leakproof." I would be comfortable amending that to allow the "could not identify an equality operator" error, because that error follows from type specifics, not value specifics. bttextcmp() and other varstr_cmp() callers fall afoul of the same restriction with their "could not convert string to UTF-16" errors (https://postgr.es/m/CADyhKSXPwrUv%2B9LtqPAQ_gyZTv4hYbr2KwqBxcs6a3Vee1jBLQ%40mail.gmail.com). Leaking the binary fact that an unspecified string contains an unspecified rare Unicode character is not a serious leak, however. Also, those errors would be a substantial usability impediment if they happened much in practice; you couldn't index affected values. > I think that we should either change contain_leaked_vars_walker to > explicitly verify leakproofness of the comparison function, or decide > that it's project policy that btree comparison functions are leakproof, > and change the markings on those (and their associated operators). Either of those solutions sounds fine. Like last time, I'll vote for explicitly verifying leakproofness.