BEFORE UPDATE trigger on postgres_fdw table not work

2019-05-26 Thread Shohei Mochizuki

Hi,

I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.

Below are scenarios similar to postgres_fdw test to reproduce the issue.

postgres=# CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS 
(dbname 'postgres',port '5432');
postgres=# CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
postgres=# create table loc1 (f1 serial, f2 text);
postgres=# create foreign table rem1 (f1 serial, f2 text)
postgres-#   server loopback options(table_name 'loc1');

postgres=# CREATE FUNCTION trig_row_before_insupdate() RETURNS TRIGGER AS $$
postgres$#   BEGIN
postgres$# NEW.f2 := NEW.f2 || ' triggered !';
postgres$# RETURN NEW;
postgres$#   END
postgres$# $$ language plpgsql;

postgres=# CREATE TRIGGER trig_row_before_insupd BEFORE INSERT OR UPDATE ON rem1
postgres-# FOR EACH ROW EXECUTE PROCEDURE trig_row_before_insupdate();

-- insert trigger is OK
postgres=# INSERT INTO rem1 values(1, 'insert');
postgres=# SELECT * FROM rem1;
 f1 | f2
+
  1 | insert triggered !
(1 row)

-- update trigger is OK if we update f2
postgres=# UPDATE rem1 set f2 = 'update';
postgres=# SELECT * FROM rem1;
 f1 | f2
+
  1 | update triggered !


Without attached patch:

postgres=# UPDATE rem1 set f1 = 10;
postgres=# SELECT * FROM rem1;
 f1 | f2
+
 10 | update triggered !
(1 row)

f2 should be updated by trigger, but not.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.

postgres=# EXPLAIN (verbose, costs off)
UPDATE rem1 set f1 = 10;
 QUERY PLAN
-
 Update on public.rem1
   Remote SQL: UPDATE public.loc1 SET f1 = $2 WHERE ctid = $1  <--- not set f2
   ->  Foreign Scan on public.rem1
 Output: 10, f2, ctid, rem1.*
 Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
(5 rows)

With attached patch, f2 is updated by a trigger and "f2 = $3" is added to 
remote SQL
as follows.

postgres=# UPDATE rem1 set f1 = 10;
postgres=# select * from rem1;
 f1 |   f2
+
 10 | update triggered ! triggered !
(1 row)

postgres=# EXPLAIN (verbose, costs off)
postgres-# UPDATE rem1 set f1 = 10;
  QUERY PLAN
---
 Update on public.rem1
   Remote SQL: UPDATE public.loc1 SET f1 = $2, f2 = $3 WHERE ctid = $1
   ->  Foreign Scan on public.rem1
 Output: 10, f2, ctid, rem1.*
 Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
(5 rows)

My patch adds all columns to a target list of remote update query
as in INSERT case if a before update trigger exists.

I tried to add only columns modified in trigger to the target list of
a remote update query, but I cannot find simple way to do that because
update query is built during planning phase at postgresPlanForeignModify
while it is difficult to decide which columns are modified by a trigger
until query execution.

Regards,

--
Shohei Mochizuki
TOSHIBA CORPORATION
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index e034b03..546d97b 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -6558,6 +6558,14 @@ SELECT * from loc1;
   2 | skidoo triggered !
 (2 rows)
 
+UPDATE rem1 set f1 = 10;
+SELECT * from loc1;
+ f1 |   f2   
++
+ 10 | skidoo triggered ! triggered !
+ 10 | skidoo triggered ! triggered !
+(2 rows)
+
 DELETE FROM rem1;
 -- Add a second trigger, to check that the changes are propagated correctly
 -- from trigger to trigger
@@ -6670,7 +6678,7 @@ NOTICE:  trig_row_after(23, skidoo) AFTER ROW INSERT ON 
rem1
 NOTICE:  NEW: (13,"test triggered !")
   ctid  
 
- (0,27)
+ (0,29)
 (1 row)
 
 -- cleanup
@@ -6774,10 +6782,10 @@ BEFORE UPDATE ON rem1
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 EXPLAIN (verbose, costs off)
 UPDATE rem1 set f2 = '';  -- can't be pushed down
- QUERY PLAN  
--
+  QUERY PLAN   
+---
  Update on public.rem1
-   Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1
+   Remote SQL: UPDATE public.loc1 SET f1 = $2, f2 = $3 WHERE ctid = $1
->  Foreign Scan on public.rem1
  Output: f1, ''::text, ctid, rem1.*
  Remote SQL: SELECT 

Re: BEFORE UPDATE trigger on postgres_fdw table not work

2019-05-27 Thread Shohei Mochizuki

On 2019/05/28 12:54, Amit Langote wrote:

On 2019/05/27 22:02, Tom Lane wrote:

Amit Langote  writes:

On 2019/05/27 10:52, Shohei Mochizuki wrote:

I noticed returning a modified record in a row-level BEFORE UPDATE trigger
on postgres_fdw foreign tables do not work. Attached patch fixes this issue.
This is because current fdw code adds only columns to RemoteSQL that were
explicitly targets of the UPDATE as follows.



Yeah.  So, the trigger execution correctly modifies the existing tuple
fetched from the remote server, but those changes are then essentially
discarded by postgres_fdw, that is, postgresExecForeignModify().



... Also, in the worst case, we'll end
up generating new query for every row being changed, because the trigger
may change different columns for different rows based on some condition.


Perhaps, if the table has relevant BEFORE triggers, we should just abandon
our attempts to optimize away fetching/storing all columns?  It seems like
another potential hazard here is a trigger needing to read a column that
is not mentioned in the SQL query.


The fetching side is fine, because rewriteTargetListUD() adds a
whole-row-var to the target list when the UPDATE / DELETE target is a
foreign table *and* there is a row trigger on the table.  postgres_fdw
sees that and constructs the query to fetch all columns.

So, the only problem here is the optimizing away of storing all columns,
which the Mochizuki-san's patch seems enough to fix.


Amit-san, Tom,
Thanks for the comments.

I checked other scenario. If a foreign table has AFTER trigger, remote update
query must return all columns and these cases are added at deparseReturningList
and covered by following existing test cases.

EXPLAIN (verbose, costs off)
UPDATE rem1 set f2 = '';  -- can't be pushed down
  QUERY PLAN
---
 Update on public.rem1
   Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2
   ->  Foreign Scan on public.rem1
 Output: f1, ''::text, ctid, rem1.*
 Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
(5 rows)


Regards,

--
Shohei Mochizuki
TOSHIBA CORPORATION