Joshua Tolley <eggyk...@gmail.com> writes: >> >> ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id =3D y.id AND x.id= 2 =3D y.id2) >> >=20 >> it says X and Y ... the selectivity of joins are what i am most >> interested in. cross correlation of columns within the same table are >> just a byproduct. the core thing is: how can i estimate the number >> of rows returned from a join? > > All the discussion of this topic that I've seen has been limited to the s= ingle > table case. The hard problem in that case is coming up with something you= can > precalculate that will actually be useful during query planning, without > taking too much disk, memory, CPU, or something else. Expanding the discu= ssion > to include join relations certainly still has valid use cases, but is even > harder, because you've also got to keep track of precisely how the underl= ying > relations are joined, so you know in what context the statistics remain v= alid.
Well I've been proposing to handle the correlation problem in another way in some past mails here, and I've been trying to write it down too: http://archives.postgresql.org/pgsql-performance/2009-06/msg00118.php http://tapoueh.org/char10.html#sec13 What I propose is to extend ANALYZE to be able to work on a VIEW too, rather than just a table. The hard parts seems to be: a. what stats to record, exploiting the view definition the best we can b. how to match a user query against the view definitions we have in order to actually use the stats If you have answers or good ideas=C2=A0:) Regards, --=20 dim -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers