On 31 October 2018 at 08:52, Tom Lane <t...@sss.pgh.pa.us> wrote: > David Rowley <david.row...@2ndquadrant.com> writes: >> I've started working on something I've ended up calling "Super >> PathKeys". The idea here is to increase the likelihood of a Path with >> PathKeys being used for a purpose that requires a less strict sort >> order due to ordering being required from the return value of some >> precision loss function. > > I'm a little confused by the terminology here, or why this has anything > at all to do with a new sort of pathkey. It seems like the idea you're > driving at is to be able to mark a function as being order-preserving, > in the sense that if one input is known sorted then the output will > also be sorted (by the same or a related opclass). You probably need > some additional constraints, like any other inputs being constants, > before that really works. But given that, I don't see why you need a > new kind of pathkey: you just have a new way to conclude that some > path is sorted by the pathkey you already want.
Thanks for chipping in on this. The additional pathkeys are not required to make it work, they're just required to make it work efficiently. The fact that we could to the trouble of making pathkeys canonical so we can perform pointer comparison rather than using equals() says to me that I'd better not do anything to slow this down too much. Doing it without the superkey idea seems to require quite a bit of analysis during pathkeys_contained_in() as we'd need to check for superkeys then, instead of when we're building the pathkey in the first place. As for the code that I did add to pathkeys_contained_in(), I doubt it's measurably slower for the normal case. The other fields being Const part I did miss. That will also be a requirement. I just failed to consider that date_trunc() could be used with a variable 1st arg. > Maybe if I read the patch it'd be clearer why you want to describe it > that way, but I'm too lazy to do that right now. One thing I would > say though is that a pg_proc column identifying the interesting input > parameter is insufficient; you'd need to *explicitly* say which opclass(es) > the sorting behavior guarantee applies for. > >> -- Test a more complex case where the superkey can be matched to >> multiple pathkeys >> explain (costs off) select date_trunc('year', ts), date_trunc('month', >> ts), a, count(*) from tstbl group by 1,2,3 order by 1,2,3; >> QUERY PLAN >> ----------------------------------------------------------------------------- >> GroupAggregate >> Group Key: date_trunc('year'::text, ts), date_trunc('month'::text, ts), a >> -> Index Only Scan using tstbl_ts_a_idx on tstbl >> (3 rows) > > [ squint... ] Does that really work? If so, how? It would need a whole > lot more knowledge about the behavior of date_trunc than I think could > possibly be reasonable to encode in a general mechanism. I'm not entirely certain we can always consume multiple matching keys or if it has to be one for one. However, I get an empty diff from each of the following two query pairs. select date_trunc('month'::text,date),date_Trunc('year', date) from dt order by dt; select date_trunc('month'::text,date),date_Trunc('year', date) from dt order by 1,2; and select date_trunc('year'::text,date),date_Trunc('month', date) from dt order by dt; select date_trunc('year'::text,date),date_Trunc('month', date) from dt order by 1,2; with setup: create table dt (date date); insert into dt select d from generate_series('2018-01-01', '2018-12-31', '1 day'::interval) d; -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services