Hello hackers, I have a question regarding distributing the filter clause(baserestrictinfo) of one table into another table(Keys belong to the same EquivalenceClass).
In the following query, why PG is not copying the filter (t1.pk=1 OR t1.pk=2) into t2's baserestrictinfo? I believe PG copies those filters which are OpExpr and not BoolExpr, but still wanted to know what would be the risks if it gets copied. SELECT * FROM t1 INNER JOIN t2 ON (t1.pk = t2.pk) WHERE t1.pk = 1 OR t1.pk = 2; The filters are effectively: (t1.pk = t2.pk) AND (t1.pk = 1 OR t1.pk = 2). Can we expand this into (t1.pk = t2.pk) AND (t1.pk = 1 OR t1.pk = 2) AND ( t2.pk = 1 OR t2.pk = 2)? The above query is resulting in a Query Plan like: [Scan(t1, with filter pk = 1 OR pk = 2)] Join [Scan(t2, with Parameter t1.pk = t2.pk)] If PG copies t1's filter into t2, it could've been like this: [Scan(t1, with filter pk = 1 OR pk = 2)] Join [Scan(t2, with *filter pk = 1 OR pk = 2*)] With Postgres Table Partition, this results in more performance issues. Unneeded partitions need to be scanned, since the filters are not getting copied. Actually, in my case, both t1 and t2 are HASH partitioned with the key (pk), and with the same number of partitions and range. And running the same query results in reading only 2 partitions of t1, and all of the partitions of t2. If we could copy the filter into t2 as well, then only 2 partitions of t2 would be required to be read. What could be the reasons for NOT copying the t1's filters into t2's baserestrictinfo? If we copy that, could that result in wrong results? P.S. PlanTree for some sample queries is attached for reference. Thanks, Mohamed Insaf K
-- Create two tables CREATE TABLE table1(pk INT PRIMARY KEY) PARTITION BY HASH(pk); CREATE TABLE table2(pk INT PRIMARY KEY) PARTITION BY HASH(pk); -- Create 4 partitions for each of the two tables CREATE TABLE table1_p1 PARTITION OF table1 FOR VALUES WITH (MODULUS 4, REMAINDER 0); CREATE TABLE table1_p2 PARTITION OF table1 FOR VALUES WITH (MODULUS 4, REMAINDER 1); CREATE TABLE table1_p3 PARTITION OF table1 FOR VALUES WITH (MODULUS 4, REMAINDER 2); CREATE TABLE table1_p4 PARTITION OF table1 FOR VALUES WITH (MODULUS 4, REMAINDER 3); CREATE TABLE table2_p1 PARTITION OF table2 FOR VALUES WITH (MODULUS 4, REMAINDER 0); CREATE TABLE table2_p2 PARTITION OF table2 FOR VALUES WITH (MODULUS 4, REMAINDER 1); CREATE TABLE table2_p3 PARTITION OF table2 FOR VALUES WITH (MODULUS 4, REMAINDER 2); CREATE TABLE table2_p4 PARTITION OF table2 FOR VALUES WITH (MODULUS 4, REMAINDER 3); -- INSERT data into both of the two tables. INSERT INTO table1 SELECT generate_series(1, 8); INSERT INTO table2 SELECT generate_series(1, 8); ------------------------------------------------------------------ QueryPlan for the query with IN clause ------------------------------------------------------------------ postgres=# EXPLAIN (COSTS OFF) SELECT * FROM table1 t1 INNER JOIN table2 t2 ON (t1.pk = t2.pk) WHERE t1.pk IN (1,2); QUERY PLAN -------------------------------------------------------------------- Nested Loop -> Append -> Bitmap Heap Scan on table1_p1 t1 Recheck Cond: (pk = ANY ('{1,2}'::integer[])) -> Bitmap Index Scan on table1_p1_pkey Index Cond: (pk = ANY ('{1,2}'::integer[])) -> Bitmap Heap Scan on table1_p3 t1_1 Recheck Cond: (pk = ANY ('{1,2}'::integer[])) -> Bitmap Index Scan on table1_p3_pkey Index Cond: (pk = ANY ('{1,2}'::integer[])) -> Append -> Index Only Scan using table2_p1_pkey on table2_p1 t2 Index Cond: (pk = t1.pk) -> Index Only Scan using table2_p2_pkey on table2_p2 t2_1 Index Cond: (pk = t1.pk) -> Index Only Scan using table2_p3_pkey on table2_p3 t2_2 Index Cond: (pk = t1.pk) -> Index Only Scan using table2_p4_pkey on table2_p4 t2_3 Index Cond: (pk = t1.pk) (19 rows) ------------------------------------------------------------------------------------------------------------------------- QueryPlan for the query with OR clause ------------------------------------------------------------------------------------------------------------------------- postgres=# EXPLAIN (COSTS OFF) SELECT * FROM table1 t1 INNER JOIN table2 t2 ON (t1.pk = t2.pk) WHERE t1.pk = 1 OR t1.pk = 2; QUERY PLAN -------------------------------------------------------------------- Nested Loop -> Append -> Bitmap Heap Scan on table1_p1 t1 Recheck Cond: ((pk = 1) OR (pk = 2)) -> BitmapOr -> Bitmap Index Scan on table1_p1_pkey Index Cond: (pk = 1) -> Bitmap Index Scan on table1_p1_pkey Index Cond: (pk = 2) -> Bitmap Heap Scan on table1_p3 t1_1 Recheck Cond: ((pk = 1) OR (pk = 2)) -> BitmapOr -> Bitmap Index Scan on table1_p3_pkey Index Cond: (pk = 1) -> Bitmap Index Scan on table1_p3_pkey Index Cond: (pk = 2) -> Append -> Index Only Scan using table2_p1_pkey on table2_p1 t2 Index Cond: (pk = t1.pk) -> Index Only Scan using table2_p2_pkey on table2_p2 t2_1 Index Cond: (pk = t1.pk) -> Index Only Scan using table2_p3_pkey on table2_p3 t2_2 Index Cond: (pk = t1.pk) -> Index Only Scan using table2_p4_pkey on table2_p4 t2_3 Index Cond: (pk = t1.pk) (25 rows) ------------------------------------------------------------------ QueryPlan for the query with just a filter ------------------------------------------------------------------ postgres=# EXPLAIN (COSTS OFF) SELECT * FROM table1 t1 INNER JOIN table2 t2 ON (t1.pk = t2.pk) WHERE t1.pk = 1; QUERY PLAN ------------------------------------------------------------------ Nested Loop -> Append -> Index Only Scan using table1_p1_pkey on table1_p1 t1 Index Cond: (pk = 1) -> Append -> Index Only Scan using table2_p1_pkey on table2_p1 t2 Index Cond: (pk = 1) (7 rows)