Hi!

My colleague reviewed my patch and gave feedback on how to improve it - for some queries with data types that I did not consider, pull-up is not applied, although it should. Some of them:

EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
SELECT 1
  FROM ta
 WHERE EXISTS (SELECT 1
                 FROM tb
                 JOIN tc
                   ON ta.id = tb.id
                  AND tb.id = ANY('{1}'::int[])
              );

                                QUERY PLAN
--------------------------------------------------------------------------
 Seq Scan on ta (actual rows=1.00 loops=1)
   Filter: EXISTS(SubPlan 1)
   Rows Removed by Filter: 1
   SubPlan 1
     ->  Nested Loop (actual rows=0.50 loops=2)
           ->  Seq Scan on tb (actual rows=0.50 loops=2)
                 Filter: ((id = ANY ('{1}'::integer[])) AND (ta.id = id))
                 Rows Removed by Filter: 2
           ->  Seq Scan on tc (actual rows=1.00 loops=1)

EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
 SELECT 1
   FROM ta
  WHERE EXISTS (SELECT 1
                  FROM tb
                  JOIN tc
                    ON ta.id = tb.id
                   AND tb.is_active
               );
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on ta (actual rows=2.00 loops=1)
   Filter: EXISTS(SubPlan 1)
   SubPlan 1
     ->  Nested Loop (actual rows=1.00 loops=2)
           ->  Seq Scan on tb (actual rows=1.00 loops=2)
                 Filter: (is_active AND (ta.id = id))
                 Rows Removed by Filter: 0
           ->  Seq Scan on tc (actual rows=1.00 loops=2)

EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
 SELECT 1
   FROM ta
  WHERE EXISTS (SELECT 1
                  FROM tb
                  JOIN tc
                    ON ta.id = tb.id
                   AND tb.is_active IS NOT NULL
               );

                                   QUERY PLAN
--------------------------------------------------------------------
 Seq Scan on ta (actual rows=2.00 loops=1)
   Filter: EXISTS(SubPlan 1)
    SubPlan 1
       ->  Nested Loop (actual rows=1.00 loops=2)
             ->  Seq Scan on tb (actual rows=1.00 loops=2)
                   Filter: ((is_active IS NOT NULL) AND (ta.id = id))
                   Rows Removed by Filter: 0
             ->  Seq Scan on tc (actual rows=1.00 loops=2)

UPDATE tb SET is_active = NULL WHERE id = 2;

EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
 SELECT 1
   FROM ta
  WHERE EXISTS (SELECT 1
                  FROM tb
                  JOIN tc
                    ON ta.id = tb.id
                   AND tb.is_active IS NULL
               );
                              QUERY PLAN
----------------------------------------------------------------
 Seq Scan on ta (actual rows=1.00 loops=1)
   Filter: EXISTS(SubPlan 1)
   Rows Removed by Filter: 1
   SubPlan 1
     ->  Nested Loop (actual rows=0.50 loops=2)
           ->  Seq Scan on tb (actual rows=0.50 loops=2)
                 Filter: ((is_active IS NULL) AND (ta.id = id))
                 Rows Removed by Filter: 4
           ->  Seq Scan on tc (actual rows=1.00 loops=1)

I see that I need to add a walker that, when traversing the tree, determines whether there are conditions under which pull-up is impossible - the presence of volatility of functions and other restrictions, and leave the transformation for the var objects that I added before, I described it here.

Unfortunately, I need a few days to implement this and need time for a review, and I think I will not have time to do this before the code freeze, so I am moving this to the next commitfest and not changing the status "awaiting the author".

On 11.02.2025 18:59, Alena Rybakina wrote:
On 10.02.2025 23:51, Ilia Evdokimov wrote:

On 09.02.2025 18:14, Alena Rybakina wrote:
Hi! I found another example where the transformation worked incorrectly and reconsidered the idea.

As for conversion of exists_sublink_to_ANY, we need to get the flattened implicit-AND list of clauses and pull out the chunks of the WHERE clause that belong to the parent query, since we are called halfway through the parent's preprocess_expression() and earlier steps of preprocess_expression() wouldn't get applied to the pulled-up stuff unless we do them here. We also do some processing for vars depending on which side the var is on - if it's in a subquery, we only need to lower its level (varlevel) because subquery will be flatted, while for other vars that belong to the parent query, we need to do preparation to pull up the sub-select into top range table.

For those expressions that we couldn't assign to either list, we define newWhere and apply both cases.


When I run 'make -C contrib/ check', tests of postgres_fdw extension failed. I might be wrong, but you should be careful with LIMIT.

Thank you for the review, I'm working on it.

Sorry for not responding, but I will fix this bug after I update the code based on the comments above. Thank you for noticing and writing to me, your feedback is very important.

--
Regards,
Alena Rybakina
Postgres Professional



Reply via email to