Hi Hackers,
 
While dealing with a few queries, I noticed that when the join expression (join 
clause) is used in projection as well, the expression will be computed twice.
For a better understanding, please take reference from the following example:


SELECT sensitive_data1, column1 FROM benchmark_encytion AS t1 LEFT JOIN ( 
SELECT aes256_cbc_decrypt( c1, '\x1234' :: bytea, '\x5678' :: bytea ) AS 
column1 FROM cipher ) AS t2 ON t1.sensitive_data1 = t2.column1;






As you can see in the above Query, the join clause involves the column which 
needs to be decrypted first and then joined with other table and then in 
projection we need the needed decrypted values to print as well, in this case 
the plan generated by the PG is as mentioned below (refer to the image as well):


                                                                                
                        QUERY PLAN                                              
                                                           

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Hash Right Join  (cost=22.74..73.43 rows=1 width=65)

   Output: t1.sensitive_data1, aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, 
'6e67'::bytea)

   Hash Cond: (aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, '6e67'::bytea) = 
t1.sensitive_data1)

   ->  Foreign Scan on public.cipher  (cost=0.00..50.68 rows=1 width=49)

         Output: cipher.c1, cipher.c2, cipher.c3, cipher.c4, cipher.c5, 
cipher.c6

         CStore Dir: 
/home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/116931

         CStore Table Size: 2424 kB

   ->  Hash  (cost=22.72..22.72 rows=1 width=33)

         Output: t1.sensitive_data1

         ->  Foreign Scan on public.benchmark_encytion t1  (cost=0.00..22.72 
rows=1 width=33)

               Output: t1.sensitive_data1

               CStore Dir: 
/home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/135230

               CStore Table Size: 1268 kB

Query Identifier: 1810637692808683603

(14 rows)





As seen in the plan, join clause uses aes256_cbc_decrypt funcExpr to join 
columns and we are selecting the same as projection from hasjJoin node 
resulting in computing the expr twice, which is very costly.
 
My doubt here is, while planing this join, why can't we parse the join clause 
and pass the expressions involved there to the respective scan nodes and use it 
above that wherever needed as a Var? 
In this particular case, we can push_down the expression (decrypt funcExpr) 
from join clause to the foreign scan of cipher table. Why have we not handled 
this case in PG?
Pls share your thoughts on the same, also pls correct me if my understanding is 
wrong here.










Thanks and Regards.
Shubhankar  Kulkarni
ZLabs-CStore

Reply via email to