(2018/03/23 20:55), Etsuro Fujita wrote:
> (2018/03/23 4:09), Robert Haas wrote:
>> 1. It still doesn't work for COPY, because COPY isn't going to have a
>> ModifyTableState. I really think it ought to be possible to come up
>> with an API that can handle both INSERT and COPY; I don't think it
>> should be necessary to have two different APIs for those two problems.
>> Amit managed to do it for regular tables, and I don't really see a
>> good reason why foreign tables need to be different.
>
> Maybe I'm missing something, but I think the proposed FDW API could be
> used for the COPY case as well with some modifications to core. If so,
> my question is: should we support COPY into foreign tables as well? I
> think that if we support COPY tuple routing for foreign partitions, it
> would be better to support direct COPY into foreign partitions as well.

Done.

>> 2. I previously asked why we couldn't use the existing APIs for this,
>> and you gave me some answer, but I can't help noticing that
>> postgresExecForeignRouting is an exact copy of
>> postgresExecForeignInsert. Apparently, some code reuse is indeed
>> possible here! Why not reuse the same function instead of calling a
>> new one? If the answer is that planSlot might be NULL in this case,
>> or something like that, then let's just document that. A needless
>> proliferation of FDW APIs is really undesirable, as it raises the bar
>> for every FDW author.
>
> You've got a point! I'll change the patch that way.

Done. I modified the patch so that the existing API postgresExecForeignInsert is called as-is (i.e., with the planSlot parameter) in the INSERT case, and with that parameter set to NULL in the COPY case. Accordingly, I removed postgresExecForeignRouting and the postgres_fdw refactoring for that API.

I also renamed the remaining new APIs to postgresBeginForeignInsert and postgresEndForeignInsert, which I think is better because they are used not only for tuple routing but also for copying directly into foreign tables.

Finally, I dropped partition_index from the parameter list of postgresBeginForeignInsert. I had thought the FDW could use it to access the partition info stored in mt_partition_tuple_routing during tuple routing, but I now think the FDW wouldn't need that in practice.
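
To make the calling convention above concrete, here is a minimal standalone sketch. It is not code from the patch and does not use PostgreSQL headers: DemoSlot, DemoInsertState, DemoFdwRoutine and the demo_* functions are hypothetical stand-ins invented for illustration. It only shows the shape of the idea, namely that a single exec callback serves both an INSERT-like caller (which passes a plan slot) and a COPY-like caller (which passes NULL), bracketed by begin/end callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are PostgreSQL types. */
typedef struct DemoSlot
{
    int a;
    int b;
} DemoSlot;

typedef struct DemoInsertState
{
    int rows_sent;       /* rows handed to the pretend remote side */
    int plan_slot_seen;  /* calls that carried a non-NULL plan slot */
    int open;            /* 1 between begin and end, otherwise 0 */
} DemoInsertState;

/* Callback table loosely shaped like a begin/exec/end insert API. */
typedef struct DemoFdwRoutine
{
    void      (*begin_insert)(DemoInsertState *st);
    DemoSlot *(*exec_insert)(DemoInsertState *st, DemoSlot *slot,
                             DemoSlot *plan_slot);
    void      (*end_insert)(DemoInsertState *st);
} DemoFdwRoutine;

void
demo_begin_insert(DemoInsertState *st)
{
    memset(st, 0, sizeof(*st));
    st->open = 1;
}

DemoSlot *
demo_exec_insert(DemoInsertState *st, DemoSlot *slot, DemoSlot *plan_slot)
{
    assert(st->open);

    /* plan_slot may be NULL (COPY-like path), so only touch it if present. */
    if (plan_slot != NULL)
        st->plan_slot_seen++;

    st->rows_sent++;
    return slot;
}

void
demo_end_insert(DemoInsertState *st)
{
    st->open = 0;
}

const DemoFdwRoutine demo_routine = {
    demo_begin_insert, demo_exec_insert, demo_end_insert
};

/* INSERT-like caller: a plan slot is available and is passed through. */
int
demo_run_insert(const DemoFdwRoutine *r, DemoInsertState *st,
                DemoSlot *rows, int nrows)
{
    int i;

    r->begin_insert(st);
    for (i = 0; i < nrows; i++)
        r->exec_insert(st, &rows[i], &rows[i]);
    r->end_insert(st);
    return st->rows_sent;
}

/* COPY-like caller: there is no plan, so the plan slot is NULL. */
int
demo_run_copy(const DemoFdwRoutine *r, DemoInsertState *st,
              DemoSlot *rows, int nrows)
{
    int i;

    r->begin_insert(st);
    for (i = 0; i < nrows; i++)
        r->exec_insert(st, &rows[i], NULL);
    r->end_insert(st);
    return st->rows_sent;
}
```

The point of the sketch is only that the exec callback has to tolerate a NULL plan slot; how a real FDW would use planSlot, if at all, is up to that FDW.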

>> However, I think that getting INSERT
>> but not COPY, with bad performance, and with duplicated APIs, is
>> moving more in the wrong direction than the right one.
>
> Will fix.

Done.

Attached is the new version of the patch. foreign-routing-fdwapi-2.patch applies on top of postgres-fdw-refactoring-2.patch. (The former includes the bug fix [1].) Any feedback is welcome!

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/5aba4074.1090...@lab.ntt.co.jp
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 375,386 **** static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
--- 375,395 ----
  static void create_cursor(ForeignScanState *node);
  static void fetch_more_data(ForeignScanState *node);
  static void close_cursor(PGconn *conn, unsigned int cursor_number);
+ static PgFdwModifyState *create_foreign_modify(EState *estate,
+ 					  ResultRelInfo *resultRelInfo,
+ 					  CmdType operation,
+ 					  Plan *subplan,
+ 					  char *query,
+ 					  List *target_attrs,
+ 					  bool has_returning,
+ 					  List *retrieved_attrs);
  static void prepare_foreign_modify(PgFdwModifyState *fmstate);
  static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
  						 ItemPointer tupleid,
  						 TupleTableSlot *slot);
  static void store_returning_result(PgFdwModifyState *fmstate,
  					   TupleTableSlot *slot, PGresult *res);
+ static void finish_foreign_modify(PgFdwModifyState *fmstate);
  static List *build_remote_returning(Index rtindex, Relation rel,
  					   List *returningList);
  static void rebuild_fdw_scan_tlist(ForeignScan *fscan, List *tlist);
***************
*** 1678,1695 **** postgresBeginForeignModify(ModifyTableState *mtstate,
  						   int eflags)
  {
  	PgFdwModifyState *fmstate;
! 	EState	   *estate = mtstate->ps.state;
! 	CmdType		operation = mtstate->operation;
! 	Relation	rel = resultRelInfo->ri_RelationDesc;
! 	RangeTblEntry *rte;
! 	Oid			userid;
! 	ForeignTable *table;
! 	UserMapping *user;
! 	AttrNumber	n_params;
! 	Oid			typefnoid;
! 	bool		isvarlena;
! 	ListCell   *lc;
! 	TupleDesc	tupdesc = RelationGetDescr(rel);
  
  	/*
  	 * Do nothing in EXPLAIN (no ANALYZE) case.  resultRelInfo->ri_FdwState
--- 1687,1696 ----
  						   int eflags)
  {
  	PgFdwModifyState *fmstate;
! 	char	   *query;
! 	List	   *target_attrs;
! 	bool		has_returning;
! 	List	   *retrieved_attrs;
  
  	/*
  	 * Do nothing in EXPLAIN (no ANALYZE) case.  resultRelInfo->ri_FdwState
***************
*** 1698,1779 **** postgresBeginForeignModify(ModifyTableState *mtstate,
  	if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
  		return;
  
- 	/* Begin constructing PgFdwModifyState. */
- 	fmstate = (PgFdwModifyState *) palloc0(sizeof(PgFdwModifyState));
- 	fmstate->rel = rel;
- 
- 	/*
- 	 * Identify which user to do the remote access as.  This should match what
- 	 * ExecCheckRTEPerms() does.
- 	 */
- 	rte = rt_fetch(resultRelInfo->ri_RangeTableIndex, estate->es_range_table);
- 	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
- 
- 	/* Get info about foreign table. */
- 	table = GetForeignTable(RelationGetRelid(rel));
- 	user = GetUserMapping(userid, table->serverid);
- 
- 	/* Open connection; report that we'll create a prepared statement. */
- 	fmstate->conn = GetConnection(user, true);
- 	fmstate->p_name = NULL;		/* prepared statement not made yet */
- 
  	/* Deconstruct fdw_private data. */
! 	fmstate->query = strVal(list_nth(fdw_private,
! 									 FdwModifyPrivateUpdateSql));
! 	fmstate->target_attrs = (List *) list_nth(fdw_private,
! 											  FdwModifyPrivateTargetAttnums);
! 	fmstate->has_returning = intVal(list_nth(fdw_private,
! 											 FdwModifyPrivateHasReturning));
! 	fmstate->retrieved_attrs = (List *) list_nth(fdw_private,
! 												 FdwModifyPrivateRetrievedAttrs);
! 
! 	/* Create context for per-tuple temp workspace. */
! 	fmstate->temp_cxt = AllocSetContextCreate(estate->es_query_cxt,
! 											  "postgres_fdw temporary data",
! 											  ALLOCSET_SMALL_SIZES);
! 
! 	/* Prepare for input conversion of RETURNING results. */
! 	if (fmstate->has_returning)
! 		fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
! 
! 	/* Prepare for output conversion of parameters used in prepared stmt. */
! 	n_params = list_length(fmstate->target_attrs) + 1;
! 	fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
! 	fmstate->p_nums = 0;
! 
! 	if (operation == CMD_UPDATE || operation == CMD_DELETE)
! 	{
! 		/* Find the ctid resjunk column in the subplan's result */
! 		Plan	   *subplan = mtstate->mt_plans[subplan_index]->plan;
! 
! 		fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
! 														  "ctid");
! 		if (!AttributeNumberIsValid(fmstate->ctidAttno))
! 			elog(ERROR, "could not find junk ctid column");
  
! 		/* First transmittable parameter will be ctid */
! 		getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
! 		fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
! 		fmstate->p_nums++;
! 	}
! 
! 	if (operation == CMD_INSERT || operation == CMD_UPDATE)
! 	{
! 		/* Set up for remaining transmittable parameters */
! 		foreach(lc, fmstate->target_attrs)
! 		{
! 			int			attnum = lfirst_int(lc);
! 			Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);
! 
! 			Assert(!attr->attisdropped);
! 
! 			getTypeOutputInfo(attr->atttypid, &typefnoid, &isvarlena);
! 			fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
! 			fmstate->p_nums++;
! 		}
! 	}
! 
! 	Assert(fmstate->p_nums <= n_params);
  
  	resultRelInfo->ri_FdwState = fmstate;
  }
--- 1699,1723 ----
  	if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
  		return;
  
  	/* Deconstruct fdw_private data. */
! 	query = strVal(list_nth(fdw_private,
! 							FdwModifyPrivateUpdateSql));
! 	target_attrs = (List *) list_nth(fdw_private,
! 									 FdwModifyPrivateTargetAttnums);
! 	has_returning = intVal(list_nth(fdw_private,
! 									FdwModifyPrivateHasReturning));
! 	retrieved_attrs = (List *) list_nth(fdw_private,
! 										FdwModifyPrivateRetrievedAttrs);
  
! 	/* Construct an execution state. */
! 	fmstate = create_foreign_modify(mtstate->ps.state,
! 									resultRelInfo,
! 									mtstate->operation,
! 									mtstate->mt_plans[subplan_index]->plan,
! 									query,
! 									target_attrs,
! 									has_returning,
! 									retrieved_attrs);
  
  	resultRelInfo->ri_FdwState = fmstate;
  }
***************
*** 2008,2035 **** postgresEndForeignModify(EState *estate,
  	if (fmstate == NULL)
  		return;
  
! 	/* If we created a prepared statement, destroy it */
! 	if (fmstate->p_name)
! 	{
! 		char		sql[64];
! 		PGresult   *res;
! 
! 		snprintf(sql, sizeof(sql), "DEALLOCATE %s", fmstate->p_name);
! 
! 		/*
! 		 * We don't use a PG_TRY block here, so be careful not to throw error
! 		 * without releasing the PGresult.
! 		 */
! 		res = pgfdw_exec_query(fmstate->conn, sql);
! 		if (PQresultStatus(res) != PGRES_COMMAND_OK)
! 			pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
! 		PQclear(res);
! 		fmstate->p_name = NULL;
! 	}
! 
! 	/* Release remote connection */
! 	ReleaseConnection(fmstate->conn);
! 	fmstate->conn = NULL;
  }
  
  /*
--- 1952,1959 ----
  	if (fmstate == NULL)
  		return;
  
! 	/* Destroy the execution state. */
! 	finish_foreign_modify(fmstate);
  }
  
  /*
***************
*** 3217,3222 **** close_cursor(PGconn *conn, unsigned int cursor_number)
--- 3141,3249 ----
  }
  
  /*
+  * create_foreign_modify
+  *		Construct an execution state of a foreign insert/update/delete
+  *		operation.
+  */
+ static PgFdwModifyState *
+ create_foreign_modify(EState *estate,
+ 					  ResultRelInfo *resultRelInfo,
+ 					  CmdType operation,
+ 					  Plan *subplan,
+ 					  char *query,
+ 					  List *target_attrs,
+ 					  bool has_returning,
+ 					  List *retrieved_attrs)
+ {
+ 	PgFdwModifyState *fmstate;
+ 	Relation	rel = resultRelInfo->ri_RelationDesc;
+ 	RangeTblEntry *rte;
+ 	Oid			userid;
+ 	ForeignTable *table;
+ 	UserMapping *user;
+ 	AttrNumber	n_params;
+ 	Oid			typefnoid;
+ 	bool		isvarlena;
+ 	ListCell   *lc;
+ 	TupleDesc	tupdesc = RelationGetDescr(rel);
+ 
+ 	/* Begin constructing PgFdwModifyState. */
+ 	fmstate = (PgFdwModifyState *) palloc0(sizeof(PgFdwModifyState));
+ 	fmstate->rel = rel;
+ 
+ 	/*
+ 	 * Identify which user to do the remote access as.  This should match what
+ 	 * ExecCheckRTEPerms() does.
+ 	 */
+ 	rte = rt_fetch(resultRelInfo->ri_RangeTableIndex, estate->es_range_table);
+ 	userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
+ 
+ 	/* Get info about foreign table. */
+ 	table = GetForeignTable(RelationGetRelid(rel));
+ 	user = GetUserMapping(userid, table->serverid);
+ 
+ 	/* Open connection; report that we'll create a prepared statement. */
+ 	fmstate->conn = GetConnection(user, true);
+ 	fmstate->p_name = NULL;		/* prepared statement not made yet */
+ 
+ 	/* Set up remote query information. */
+ 	fmstate->query = query;
+ 	fmstate->target_attrs = target_attrs;
+ 	fmstate->has_returning = has_returning;
+ 	fmstate->retrieved_attrs = retrieved_attrs;
+ 
+ 	/* Create context for per-tuple temp workspace. */
+ 	fmstate->temp_cxt = AllocSetContextCreate(estate->es_query_cxt,
+ 											  "postgres_fdw temporary data",
+ 											  ALLOCSET_SMALL_SIZES);
+ 
+ 	/* Prepare for input conversion of RETURNING results. */
+ 	if (fmstate->has_returning)
+ 		fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ 
+ 	/* Prepare for output conversion of parameters used in prepared stmt. */
+ 	n_params = list_length(fmstate->target_attrs) + 1;
+ 	fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
+ 	fmstate->p_nums = 0;
+ 
+ 	if (operation == CMD_UPDATE || operation == CMD_DELETE)
+ 	{
+ 		Assert(subplan != NULL);
+ 
+ 		/* Find the ctid resjunk column in the subplan's result */
+ 		fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
+ 														  "ctid");
+ 		if (!AttributeNumberIsValid(fmstate->ctidAttno))
+ 			elog(ERROR, "could not find junk ctid column");
+ 
+ 		/* First transmittable parameter will be ctid */
+ 		getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
+ 		fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+ 		fmstate->p_nums++;
+ 	}
+ 
+ 	if (operation == CMD_INSERT || operation == CMD_UPDATE)
+ 	{
+ 		/* Set up for remaining transmittable parameters */
+ 		foreach(lc, fmstate->target_attrs)
+ 		{
+ 			int			attnum = lfirst_int(lc);
+ 			Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);
+ 
+ 			Assert(!attr->attisdropped);
+ 
+ 			getTypeOutputInfo(attr->atttypid, &typefnoid, &isvarlena);
+ 			fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+ 			fmstate->p_nums++;
+ 		}
+ 	}
+ 
+ 	Assert(fmstate->p_nums <= n_params);
+ 
+ 	return fmstate;
+ }
+ 
+ /*
   * prepare_foreign_modify
   *		Establish a prepared statement for execution of INSERT/UPDATE/DELETE
   */
***************
*** 3359,3364 **** store_returning_result(PgFdwModifyState *fmstate,
--- 3386,3424 ----
  }
  
  /*
+  * finish_foreign_modify
+  *		Release resources for a foreign insert/update/delete operation.
+  */
+ static void
+ finish_foreign_modify(PgFdwModifyState *fmstate)
+ {
+ 	Assert(fmstate != NULL);
+ 
+ 	/* If we created a prepared statement, destroy it */
+ 	if (fmstate->p_name)
+ 	{
+ 		char		sql[64];
+ 		PGresult   *res;
+ 
+ 		snprintf(sql, sizeof(sql), "DEALLOCATE %s", fmstate->p_name);
+ 
+ 		/*
+ 		 * We don't use a PG_TRY block here, so be careful not to throw error
+ 		 * without releasing the PGresult.
+ 		 */
+ 		res = pgfdw_exec_query(fmstate->conn, sql);
+ 		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+ 			pgfdw_report_error(ERROR, res, fmstate->conn, true, sql);
+ 		PQclear(res);
+ 		fmstate->p_name = NULL;
+ 	}
+ 
+ 	/* Release remote connection */
+ 	ReleaseConnection(fmstate->conn);
+ 	fmstate->conn = NULL;
+ }
+ 
+ /*
   * build_remote_returning
   *		Build a RETURNING targetlist of a remote query for performing an
   *		UPDATE/DELETE .. RETURNING on a join directly
*** a/contrib/file_fdw/output/file_fdw.source
--- b/contrib/file_fdw/output/file_fdw.source
***************
*** 315,321 **** SELECT tableoid::regclass, * FROM p2;
  (0 rows)
  
  COPY pt FROM '@abs_srcdir@/data/list2.bad' with (format 'csv', delimiter ','); -- ERROR
! ERROR:  cannot route inserted tuples to a foreign table
  CONTEXT:  COPY pt, line 2: "1,qux"
  COPY pt FROM '@abs_srcdir@/data/list2.csv' with (format 'csv', delimiter ',');
  SELECT tableoid::regclass, * FROM pt;
--- 315,321 ----
  (0 rows)
  
  COPY pt FROM '@abs_srcdir@/data/list2.bad' with (format 'csv', delimiter ','); -- ERROR
! ERROR:  cannot insert into foreign table "p1"
  CONTEXT:  COPY pt, line 2: "1,qux"
  COPY pt FROM '@abs_srcdir@/data/list2.csv' with (format 'csv', delimiter ',');
  SELECT tableoid::regclass, * FROM pt;
***************
*** 342,351 **** SELECT tableoid::regclass, * FROM p2;
  (2 rows)
  
  INSERT INTO pt VALUES (1, 'xyzzy'); -- ERROR
! ERROR:  cannot route inserted tuples to a foreign table
  INSERT INTO pt VALUES (2, 'xyzzy');
  UPDATE pt set a = 1 where a = 2; -- ERROR
! ERROR:  cannot route inserted tuples to a foreign table
  SELECT tableoid::regclass, * FROM pt;
   tableoid | a |   b   
  ----------+---+-------
--- 342,351 ----
  (2 rows)
  
  INSERT INTO pt VALUES (1, 'xyzzy'); -- ERROR
! ERROR:  cannot insert into foreign table "p1"
  INSERT INTO pt VALUES (2, 'xyzzy');
  UPDATE pt set a = 1 where a = 2; -- ERROR
! ERROR:  cannot insert into foreign table "p1"
  SELECT tableoid::regclass, * FROM pt;
   tableoid | a |   b   
  ----------+---+-------
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
***************
*** 7371,7376 **** NOTICE:  drop cascades to foreign table bar2
--- 7371,7700 ----
  drop table loct1;
  drop table loct2;
  -- ===================================================================
+ -- test tuple routing for foreign-table partitions
+ -- ===================================================================
+ -- Test insert tuple routing
+ create table itrtest (a int, b int, c text) partition by list (a);
+ create table loctab1 (a int check (a in (1)), b int primary key, c text);
+ create foreign table remp1 (a int check (a in (1)), b int, c text) server loopback options (table_name 'loctab1');
+ create table loctab2 (b int primary key, c text, a int check (a in (2)));
+ create foreign table remp2 (b int, c text, a int check (a in (2))) server loopback options (table_name 'loctab2');
+ alter table itrtest attach partition remp1 for values in (1);
+ alter table itrtest attach partition remp2 for values in (2);
+ insert into itrtest values (1, 1, 'foo');
+ insert into itrtest values (1, 2, 'bar') returning *;
+  a | b |  c  
+ ---+---+-----
+  1 | 2 | bar
+ (1 row)
+ 
+ insert into itrtest values (2, 1, 'baz') returning *;
+  a | b |  c  
+ ---+---+-----
+  2 | 1 | baz
+ (1 row)
+ 
+ select tableoid::regclass, * FROM itrtest;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp1    | 1 | 1 | foo
+  remp1    | 1 | 2 | bar
+  remp2    | 2 | 1 | baz
+ (3 rows)
+ 
+ select tableoid::regclass, * FROM remp1;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp1    | 1 | 1 | foo
+  remp1    | 1 | 2 | bar
+ (2 rows)
+ 
+ select tableoid::regclass, * FROM remp2;
+  tableoid | b |  c  | a 
+ ----------+---+-----+---
+  remp2    | 1 | baz | 2
+ (1 row)
+ 
+ insert into itrtest values (2, 1, 'baz');
+ ERROR:  duplicate key value violates unique constraint "loctab2_pkey"
+ DETAIL:  Key (b)=(1) already exists.
+ CONTEXT:  remote SQL command: INSERT INTO public.loctab2(b, c, a) VALUES ($1, $2, $3)
+ insert into itrtest values (2, 1, 'baz') on conflict do nothing;
+ insert into itrtest values (2, 2, 'qux') on conflict do nothing returning *;
+  a | b |  c  
+ ---+---+-----
+  2 | 2 | qux
+ (1 row)
+ 
+ select tableoid::regclass, * FROM itrtest;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp1    | 1 | 1 | foo
+  remp1    | 1 | 2 | bar
+  remp2    | 2 | 1 | baz
+  remp2    | 2 | 2 | qux
+ (4 rows)
+ 
+ select tableoid::regclass, * FROM remp1;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp1    | 1 | 1 | foo
+  remp1    | 1 | 2 | bar
+ (2 rows)
+ 
+ select tableoid::regclass, * FROM remp2;
+  tableoid | b |  c  | a 
+ ----------+---+-----+---
+  remp2    | 1 | baz | 2
+  remp2    | 2 | qux | 2
+ (2 rows)
+ 
+ drop table itrtest;
+ drop table loctab1;
+ drop table loctab2;
+ -- Test update tuple routing
+ create table utrtest (a int, b int, c text) partition by list (a);
+ create table loctab (a int check (a in (1)), b int, c text);
+ create foreign table remp (a int check (a in (1)), b int, c text) server loopback options (table_name 'loctab');
+ create table locp (a int check (a in (2)), b int, c text);
+ alter table utrtest attach partition remp for values in (1);
+ alter table utrtest attach partition locp for values in (2);
+ insert into utrtest values (1, 1, 'foo');
+ insert into utrtest values (2, 2, 'qux');
+ select tableoid::regclass, * FROM utrtest;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp     | 1 | 1 | foo
+  locp     | 2 | 2 | qux
+ (2 rows)
+ 
+ select tableoid::regclass, * FROM remp;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp     | 1 | 1 | foo
+ (1 row)
+ 
+ select tableoid::regclass, * FROM locp;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  locp     | 2 | 2 | qux
+ (1 row)
+ 
+ -- It's not allowed to move a row from a partition that is foreign to another
+ update utrtest set a = 2 where c = 'foo' returning *;
+ ERROR:  new row for relation "loctab" violates check constraint "loctab_a_check"
+ DETAIL:  Failing row contains (2, 1, foo).
+ CONTEXT:  remote SQL command: UPDATE public.loctab SET a = 2 WHERE ((c = 'foo'::text)) RETURNING a, b, c
+ -- But the reverse is allowed
+ update utrtest set a = 1 where c = 'qux' returning *;
+  a | b |  c  
+ ---+---+-----
+  1 | 2 | qux
+ (1 row)
+ 
+ select tableoid::regclass, * FROM utrtest;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp     | 1 | 1 | foo
+  remp     | 1 | 2 | qux
+ (2 rows)
+ 
+ select tableoid::regclass, * FROM remp;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp     | 1 | 1 | foo
+  remp     | 1 | 2 | qux
+ (2 rows)
+ 
+ select tableoid::regclass, * FROM locp;
+  tableoid | a | b | c 
+ ----------+---+---+---
+ (0 rows)
+ 
+ -- Test that the executor doesn't let unexercised FDWs shut down
+ update utrtest set a = 1 where c = 'foo';
+ drop table utrtest;
+ drop table loctab;
+ -- Test copy tuple routing
+ create table ctrtest (a int, b int, c text) partition by list (a);
+ create table loctab1 (a int check (a in (1)), b int, c text);
+ create foreign table remp1 (a int check (a in (1)), b int, c text) server loopback options (table_name 'loctab1');
+ create table loctab2 (b int, c text, a int check (a in (2)));
+ create foreign table remp2 (b int, c text, a int check (a in (2))) server loopback options (table_name 'loctab2');
+ alter table ctrtest attach partition remp1 for values in (1);
+ alter table ctrtest attach partition remp2 for values in (2);
+ copy ctrtest from stdin;
+ select tableoid::regclass, * FROM ctrtest;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp1    | 1 | 1 | foo
+  remp2    | 2 | 2 | qux
+ (2 rows)
+ 
+ select tableoid::regclass, * FROM remp1;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp1    | 1 | 1 | foo
+ (1 row)
+ 
+ select tableoid::regclass, * FROM remp2;
+  tableoid | b |  c  | a 
+ ----------+---+-----+---
+  remp2    | 2 | qux | 2
+ (1 row)
+ 
+ -- Copying into foreign partitions directly works as well
+ copy remp1 from stdin;
+ select tableoid::regclass, * FROM remp1;
+  tableoid | a | b |  c  
+ ----------+---+---+-----
+  remp1    | 1 | 1 | foo
+  remp1    | 1 | 2 | bar
+ (2 rows)
+ 
+ drop table ctrtest;
+ drop table loctab1;
+ drop table loctab2;
+ -- ===================================================================
+ -- test COPY FROM
+ -- ===================================================================
+ create table loc2 (f1 int, f2 text);
+ alter table loc2 set (autovacuum_enabled = 'false');
+ create foreign table rem2 (f1 int, f2 text) server loopback options(table_name 'loc2');
+ -- Test basic functionality
+ copy rem2 from stdin;
+ select * from rem2;
+  f1 | f2  
+ ----+-----
+   1 | foo
+   2 | bar
+ (2 rows)
+ 
+ delete from rem2;
+ -- Test check constraints
+ alter table loc2 add constraint loc2_f1positive check (f1 >= 0);
+ alter foreign table rem2 add constraint rem2_f1positive check (f1 >= 0);
+ copy rem2 from stdin; -- fail on remote side
+ ERROR:  new row for relation "loc2" violates check constraint "loc2_f1positive"
+ DETAIL:  Failing row contains (-1, xyzzy).
+ CONTEXT:  remote SQL command: INSERT INTO public.loc2(f1, f2) VALUES ($1, $2)
+ COPY rem2, line 1: "-1	xyzzy"
+ select * from rem2;
+  f1 | f2 
+ ----+----
+ (0 rows)
+ 
+ alter foreign table rem2 drop constraint rem2_f1positive;
+ alter table loc2 drop constraint loc2_f1positive;
+ -- Test local triggers
+ create trigger trig_stmt_before before insert on rem2
+ 	for each statement execute procedure trigger_func();
+ create trigger trig_stmt_after after insert on rem2
+ 	for each statement execute procedure trigger_func();
+ create trigger trig_row_before before insert on rem2
+ 	for each row execute procedure trigger_data(23,'skidoo');
+ create trigger trig_row_after after insert on rem2
+ 	for each row execute procedure trigger_data(23,'skidoo');
+ copy rem2 from stdin;
+ NOTICE:  trigger_func(<NULL>) called: action = INSERT, when = BEFORE, level = STATEMENT
+ NOTICE:  trig_row_before(23, skidoo) BEFORE ROW INSERT ON rem2
+ NOTICE:  NEW: (1,foo)
+ NOTICE:  trig_row_before(23, skidoo) BEFORE ROW INSERT ON rem2
+ NOTICE:  NEW: (2,bar)
+ NOTICE:  trig_row_after(23, skidoo) AFTER ROW INSERT ON rem2
+ NOTICE:  NEW: (1,foo)
+ NOTICE:  trig_row_after(23, skidoo) AFTER ROW INSERT ON rem2
+ NOTICE:  NEW: (2,bar)
+ NOTICE:  trigger_func(<NULL>) called: action = INSERT, when = AFTER, level = STATEMENT
+ select * from rem2;
+  f1 | f2  
+ ----+-----
+   1 | foo
+   2 | bar
+ (2 rows)
+ 
+ drop trigger trig_row_before on rem2;
+ drop trigger trig_row_after on rem2;
+ drop trigger trig_stmt_before on rem2;
+ drop trigger trig_stmt_after on rem2;
+ delete from rem2;
+ create trigger trig_row_before_insupdate before insert on rem2
+ 	for each row execute procedure trig_row_before_insupdate();
+ -- The new values are concatenated with ' triggered !'
+ copy rem2 from stdin;
+ select * from rem2;
+  f1 |       f2        
+ ----+-----------------
+   1 | foo triggered !
+   2 | bar triggered !
+ (2 rows)
+ 
+ drop trigger trig_row_before_insupdate on rem2;
+ delete from rem2;
+ create trigger trig_null before insert on rem2
+ 	for each row execute procedure trig_null();
+ -- Nothing happens
+ copy rem2 from stdin;
+ select * from rem2;
+  f1 | f2 
+ ----+----
+ (0 rows)
+ 
+ drop trigger trig_null on rem2;
+ delete from rem2;
+ -- Test remote triggers
+ create trigger trig_row_before_insupdate before insert on loc2
+ 	for each row execute procedure trig_row_before_insupdate();
+ -- The new values are concatenated with ' triggered !'
+ copy rem2 from stdin;
+ select * from rem2;
+  f1 |       f2        
+ ----+-----------------
+   1 | foo triggered !
+   2 | bar triggered !
+ (2 rows)
+ 
+ drop trigger trig_row_before_insupdate on loc2;
+ delete from rem2;
+ create trigger trig_null before insert on loc2
+ 	for each row execute procedure trig_null();
+ -- Nothing happens
+ copy rem2 from stdin;
+ select * from rem2;
+  f1 | f2 
+ ----+----
+ (0 rows)
+ 
+ drop trigger trig_null on loc2;
+ delete from rem2;
+ -- Test a combination of local and remote triggers
+ create trigger rem2_trig_row_before before insert on rem2
+ 	for each row execute procedure trigger_data(23,'skidoo');
+ create trigger rem2_trig_row_after after insert on rem2
+ 	for each row execute procedure trigger_data(23,'skidoo');
+ create trigger loc2_trig_row_before_insupdate before insert on loc2
+ 	for each row execute procedure trig_row_before_insupdate();
+ copy rem2 from stdin;
+ NOTICE:  rem2_trig_row_before(23, skidoo) BEFORE ROW INSERT ON rem2
+ NOTICE:  NEW: (1,foo)
+ NOTICE:  rem2_trig_row_before(23, skidoo) BEFORE ROW INSERT ON rem2
+ NOTICE:  NEW: (2,bar)
+ NOTICE:  rem2_trig_row_after(23, skidoo) AFTER ROW INSERT ON rem2
+ NOTICE:  NEW: (1,"foo triggered !")
+ NOTICE:  rem2_trig_row_after(23, skidoo) AFTER ROW INSERT ON rem2
+ NOTICE:  NEW: (2,"bar triggered !")
+ select * from rem2;
+  f1 |       f2        
+ ----+-----------------
+   1 | foo triggered !
+   2 | bar triggered !
+ (2 rows)
+ 
+ drop trigger rem2_trig_row_before on rem2;
+ drop trigger rem2_trig_row_after on rem2;
+ drop trigger loc2_trig_row_before_insupdate on loc2;
+ delete from rem2;
+ -- ===================================================================
  -- test IMPORT FOREIGN SCHEMA
  -- ===================================================================
  CREATE SCHEMA import_source;
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 319,324 **** static TupleTableSlot *postgresExecForeignDelete(EState *estate,
--- 319,328 ----
  						  TupleTableSlot *planSlot);
  static void postgresEndForeignModify(EState *estate,
  						 ResultRelInfo *resultRelInfo);
+ static void postgresBeginForeignInsert(ModifyTableState *mtstate,
+ 						   ResultRelInfo *resultRelInfo);
+ static void postgresEndForeignInsert(EState *estate,
+ 						 ResultRelInfo *resultRelInfo);
  static int	postgresIsForeignRelUpdatable(Relation rel);
  static bool postgresPlanDirectModify(PlannerInfo *root,
  						 ModifyTable *plan,
***************
*** 470,475 **** postgres_fdw_handler(PG_FUNCTION_ARGS)
--- 474,481 ----
  	routine->ExecForeignUpdate = postgresExecForeignUpdate;
  	routine->ExecForeignDelete = postgresExecForeignDelete;
  	routine->EndForeignModify = postgresEndForeignModify;
+ 	routine->BeginForeignInsert = postgresBeginForeignInsert;
+ 	routine->EndForeignInsert = postgresEndForeignInsert;
  	routine->IsForeignRelUpdatable = postgresIsForeignRelUpdatable;
  	routine->PlanDirectModify = postgresPlanDirectModify;
  	routine->BeginDirectModify = postgresBeginDirectModify;
***************
*** 1957,1962 **** postgresEndForeignModify(EState *estate,
--- 1963,2058 ----
  }
  
  /*
+  * postgresBeginForeignInsert
+  *		Begin an insert operation on a foreign table
+  */
+ static void
+ postgresBeginForeignInsert(ModifyTableState *mtstate,
+ 						   ResultRelInfo *resultRelInfo)
+ {
+ 	PgFdwModifyState *fmstate;
+ 	ModifyTable	*plan = (ModifyTable *) mtstate->ps.plan;
+ 	Relation	rel = resultRelInfo->ri_RelationDesc;
+ 	RangeTblEntry *rte;
+ 	Query	   *query;
+ 	PlannerInfo *root;
+ 	TupleDesc	tupdesc = RelationGetDescr(rel);
+ 	int			attnum;
+ 	StringInfoData sql;
+ 	List	   *targetAttrs = NIL;
+ 	List	   *retrieved_attrs = NIL;
+ 	bool		doNothing = false;
+ 
+ 	initStringInfo(&sql);
+ 
+ 	/* Set up largely-dummy planner state */
+ 	rte = makeNode(RangeTblEntry);
+ 	rte->rtekind = RTE_RELATION;
+ 	rte->relid = RelationGetRelid(rel);
+ 	rte->relkind = RELKIND_FOREIGN_TABLE;
+ 	query = makeNode(Query);
+ 	query->commandType = CMD_INSERT;
+ 	query->resultRelation = 1;
+ 	query->rtable = list_make1(rte);
+ 	root = makeNode(PlannerInfo);
+ 	root->parse = query;
+ 
+ 	/* We transmit all columns that are defined in the foreign table. */
+ 	for (attnum = 1; attnum <= tupdesc->natts; attnum++)
+ 	{
+ 		Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);
+ 
+ 		if (!attr->attisdropped)
+ 			targetAttrs = lappend_int(targetAttrs, attnum);
+ 	}
+ 
+ 	/* Check if we add the ON CONFLICT clause to the remote query. */
+ 	if (plan)
+ 	{
+ 		OnConflictAction onConflictAction = plan->onConflictAction;
+ 
+ 		/* We only support DO NOTHING without an inference specification. */
+ 		if (onConflictAction == ONCONFLICT_NOTHING)
+ 			doNothing = true;
+ 		else if (onConflictAction != ONCONFLICT_NONE)
+ 			elog(ERROR, "unexpected ON CONFLICT specification: %d",
+ 				 (int) onConflictAction);
+ 	}
+ 
+ 	/* Construct the SQL command string. */
+ 	deparseInsertSql(&sql, root, 1, rel, targetAttrs, doNothing,
+ 					 resultRelInfo->ri_returningList, &retrieved_attrs);
+ 
+ 	/* Construct an execution state. */
+ 	fmstate = create_foreign_modify(mtstate->ps.state,
+ 									resultRelInfo,
+ 									CMD_INSERT,
+ 									NULL,
+ 									sql.data,
+ 									targetAttrs,
+ 									retrieved_attrs != NIL,
+ 									retrieved_attrs);
+ 
+ 	resultRelInfo->ri_FdwState = fmstate;
+ }
+ 
+ /*
+  * postgresEndForeignInsert
+  *		Finish an insert operation on a foreign table
+  */
+ static void
+ postgresEndForeignInsert(EState *estate,
+ 						 ResultRelInfo *resultRelInfo)
+ {
+ 	PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
+ 
+ 	Assert(fmstate != NULL);
+ 
+ 	/* Destroy the execution state. */
+ 	finish_foreign_modify(fmstate);
+ }
+ 
+ /*
   * postgresIsForeignRelUpdatable
   *		Determine whether a foreign table supports INSERT, UPDATE and/or
   *		DELETE.
*** a/contrib/postgres_fdw/sql/postgres_fdw.sql
--- b/contrib/postgres_fdw/sql/postgres_fdw.sql
***************
*** 1768,1773 **** drop table loct1;
--- 1768,1995 ----
  drop table loct2;
  
  -- ===================================================================
+ -- test tuple routing for foreign-table partitions
+ -- ===================================================================
+ 
+ -- Test insert tuple routing
+ create table itrtest (a int, b int, c text) partition by list (a);
+ create table loctab1 (a int check (a in (1)), b int primary key, c text);
+ create foreign table remp1 (a int check (a in (1)), b int, c text) server loopback options (table_name 'loctab1');
+ create table loctab2 (b int primary key, c text, a int check (a in (2)));
+ create foreign table remp2 (b int, c text, a int check (a in (2))) server loopback options (table_name 'loctab2');
+ alter table itrtest attach partition remp1 for values in (1);
+ alter table itrtest attach partition remp2 for values in (2);
+ 
+ insert into itrtest values (1, 1, 'foo');
+ insert into itrtest values (1, 2, 'bar') returning *;
+ insert into itrtest values (2, 1, 'baz') returning *;
+ 
+ select tableoid::regclass, * FROM itrtest;
+ select tableoid::regclass, * FROM remp1;
+ select tableoid::regclass, * FROM remp2;
+ 
+ insert into itrtest values (2, 1, 'baz');
+ insert into itrtest values (2, 1, 'baz') on conflict do nothing;
+ insert into itrtest values (2, 2, 'qux') on conflict do nothing returning *;
+ 
+ select tableoid::regclass, * FROM itrtest;
+ select tableoid::regclass, * FROM remp1;
+ select tableoid::regclass, * FROM remp2;
+ 
+ drop table itrtest;
+ drop table loctab1;
+ drop table loctab2;
+ 
+ -- Test update tuple routing
+ create table utrtest (a int, b int, c text) partition by list (a);
+ create table loctab (a int check (a in (1)), b int, c text);
+ create foreign table remp (a int check (a in (1)), b int, c text) server loopback options (table_name 'loctab');
+ create table locp (a int check (a in (2)), b int, c text);
+ alter table utrtest attach partition remp for values in (1);
+ alter table utrtest attach partition locp for values in (2);
+ 
+ insert into utrtest values (1, 1, 'foo');
+ insert into utrtest values (2, 2, 'qux');
+ 
+ select tableoid::regclass, * FROM utrtest;
+ select tableoid::regclass, * FROM remp;
+ select tableoid::regclass, * FROM locp;
+ 
+ -- It's not allowed to move a row from a partition that is foreign to another
+ update utrtest set a = 2 where c = 'foo' returning *;
+ 
+ -- But the reverse is allowed
+ update utrtest set a = 1 where c = 'qux' returning *;
+ 
+ select tableoid::regclass, * FROM utrtest;
+ select tableoid::regclass, * FROM remp;
+ select tableoid::regclass, * FROM locp;
+ 
+ -- Test that the executor doesn't let unexercised FDWs shut down
+ update utrtest set a = 1 where c = 'foo';
+ 
+ drop table utrtest;
+ drop table loctab;
+ 
+ -- Test copy tuple routing
+ create table ctrtest (a int, b int, c text) partition by list (a);
+ create table loctab1 (a int check (a in (1)), b int, c text);
+ create foreign table remp1 (a int check (a in (1)), b int, c text) server loopback options (table_name 'loctab1');
+ create table loctab2 (b int, c text, a int check (a in (2)));
+ create foreign table remp2 (b int, c text, a int check (a in (2))) server loopback options (table_name 'loctab2');
+ alter table ctrtest attach partition remp1 for values in (1);
+ alter table ctrtest attach partition remp2 for values in (2);
+ 
+ copy ctrtest from stdin;
+ 1	1	foo
+ 2	2	qux
+ \.
+ 
+ select tableoid::regclass, * FROM ctrtest;
+ select tableoid::regclass, * FROM remp1;
+ select tableoid::regclass, * FROM remp2;
+ 
+ -- Copying into foreign partitions directly works as well
+ copy remp1 from stdin;
+ 1	2	bar
+ \.
+ 
+ select tableoid::regclass, * FROM remp1;
+ 
+ drop table ctrtest;
+ drop table loctab1;
+ drop table loctab2;
+ 
+ -- ===================================================================
+ -- test COPY FROM
+ -- ===================================================================
+ 
+ create table loc2 (f1 int, f2 text);
+ alter table loc2 set (autovacuum_enabled = 'false');
+ create foreign table rem2 (f1 int, f2 text) server loopback options(table_name 'loc2');
+ 
+ -- Test basic functionality
+ copy rem2 from stdin;
+ 1	foo
+ 2	bar
+ \.
+ select * from rem2;
+ 
+ delete from rem2;
+ 
+ -- Test check constraints
+ alter table loc2 add constraint loc2_f1positive check (f1 >= 0);
+ alter foreign table rem2 add constraint rem2_f1positive check (f1 >= 0);
+ 
+ copy rem2 from stdin; -- fail on remote side
+ -1	xyzzy
+ \.
+ select * from rem2;
+ 
+ alter foreign table rem2 drop constraint rem2_f1positive;
+ alter table loc2 drop constraint loc2_f1positive;
+ 
+ -- Test local triggers
+ create trigger trig_stmt_before before insert on rem2
+ 	for each statement execute procedure trigger_func();
+ create trigger trig_stmt_after after insert on rem2
+ 	for each statement execute procedure trigger_func();
+ create trigger trig_row_before before insert on rem2
+ 	for each row execute procedure trigger_data(23,'skidoo');
+ create trigger trig_row_after after insert on rem2
+ 	for each row execute procedure trigger_data(23,'skidoo');
+ 
+ copy rem2 from stdin;
+ 1	foo
+ 2	bar
+ \.
+ select * from rem2;
+ 
+ drop trigger trig_row_before on rem2;
+ drop trigger trig_row_after on rem2;
+ drop trigger trig_stmt_before on rem2;
+ drop trigger trig_stmt_after on rem2;
+ 
+ delete from rem2;
+ 
+ create trigger trig_row_before_insupdate before insert on rem2
+ 	for each row execute procedure trig_row_before_insupdate();
+ 
+ -- The new values are concatenated with ' triggered !'
+ copy rem2 from stdin;
+ 1	foo
+ 2	bar
+ \.
+ select * from rem2;
+ 
+ drop trigger trig_row_before_insupdate on rem2;
+ 
+ delete from rem2;
+ 
+ create trigger trig_null before insert on rem2
+ 	for each row execute procedure trig_null();
+ 
+ -- Nothing happens
+ copy rem2 from stdin;
+ 1	foo
+ 2	bar
+ \.
+ select * from rem2;
+ 
+ drop trigger trig_null on rem2;
+ 
+ delete from rem2;
+ 
+ -- Test remote triggers
+ create trigger trig_row_before_insupdate before insert on loc2
+ 	for each row execute procedure trig_row_before_insupdate();
+ 
+ -- The new values are concatenated with ' triggered !'
+ copy rem2 from stdin;
+ 1	foo
+ 2	bar
+ \.
+ select * from rem2;
+ 
+ drop trigger trig_row_before_insupdate on loc2;
+ 
+ delete from rem2;
+ 
+ create trigger trig_null before insert on loc2
+ 	for each row execute procedure trig_null();
+ 
+ -- Nothing happens
+ copy rem2 from stdin;
+ 1	foo
+ 2	bar
+ \.
+ select * from rem2;
+ 
+ drop trigger trig_null on loc2;
+ 
+ delete from rem2;
+ 
+ -- Test a combination of local and remote triggers
+ create trigger rem2_trig_row_before before insert on rem2
+ 	for each row execute procedure trigger_data(23,'skidoo');
+ create trigger rem2_trig_row_after after insert on rem2
+ 	for each row execute procedure trigger_data(23,'skidoo');
+ create trigger loc2_trig_row_before_insupdate before insert on loc2
+ 	for each row execute procedure trig_row_before_insupdate();
+ 
+ copy rem2 from stdin;
+ 1	foo
+ 2	bar
+ \.
+ select * from rem2;
+ 
+ drop trigger rem2_trig_row_before on rem2;
+ drop trigger rem2_trig_row_after on rem2;
+ drop trigger loc2_trig_row_before_insupdate on loc2;
+ 
+ delete from rem2;
+ 
+ -- ===================================================================
  -- test IMPORT FOREIGN SCHEMA
  -- ===================================================================
  
*** a/doc/src/sgml/ddl.sgml
--- b/doc/src/sgml/ddl.sgml
***************
*** 3037,3047 **** VALUES ('Albany', NULL, NULL, 'NY');
     </para>
  
     <para>
!     Partitions can also be foreign tables
!     (see <xref linkend="sql-createforeigntable"/>),
!     although these have some limitations that normal tables do not.  For
!     example, data inserted into the partitioned table is not routed to
!     foreign table partitions.
     </para>
  
     <para>
--- 3037,3045 ----
     </para>
  
     <para>
!     Partitions can also be foreign tables, although they have some limitations
!     that normal tables do not; see <xref linkend="sql-createforeigntable"/> for
!     more information.
     </para>
  
     <para>
*** a/doc/src/sgml/fdwhandler.sgml
--- b/doc/src/sgml/fdwhandler.sgml
***************
*** 690,695 **** EndForeignModify(EState *estate,
--- 690,759 ----
      </para>
  
      <para>
+      Tuples inserted into a partitioned table are routed to partitions.  If an
+      FDW supports routable foreign-table partitions, it should also provide
+      the following callback functions.  These functions are also called when
+      <command>COPY FROM</command> is executed on a foreign table.
+     </para>
+ 
+     <para>
+ <programlisting>
+ void
+ BeginForeignInsert(ModifyTableState *mtstate,
+                    ResultRelInfo *rinfo);
+ </programlisting>
+ 
+      Begin executing an insert operation on a foreign table.  This routine is
+      called right before the first tuple is inserted into the foreign table,
+      both when it is the partition chosen for tuple routing and when it is the
+      target specified in a <command>COPY FROM</command> command.  It should
+      perform any initialization needed prior to the actual insertion.
+      Subsequently, <function>ExecForeignInsert</function> will be called for
+      each tuple to be inserted into the foreign table.
+     </para>
+ 
+     <para>
+      <literal>mtstate</literal> is the overall state of the
+      <structname>ModifyTable</structname> plan node being executed; global data about
+      the plan and execution state is available via this structure.
+      <literal>rinfo</literal> is the <structname>ResultRelInfo</structname> struct describing
+      the target foreign table.  (The <structfield>ri_FdwState</structfield> field of
+      <structname>ResultRelInfo</structname> is available for the FDW to store any
+      private state it needs for this operation.)
+     </para>
+ 
+     <para>
+      When this is called by a <command>COPY FROM</command> command, the
+      plan-related global data in <literal>mtstate</literal> is not provided
+      and the <literal>planSlot</literal> parameter of
+      <function>ExecForeignInsert</function> called for each inserted tuple is
+      <literal>NULL</literal>, whether the foreign table is the partition chosen
+      for tuple routing or the target specified in the command.
+     </para>
+ 
+     <para>
+      If the <function>BeginForeignInsert</function> pointer is set to
+      <literal>NULL</literal>, no action is taken for the initialization.
+     </para>
+ 
+     <para>
+ <programlisting>
+ void
+ EndForeignInsert(EState *estate,
+                  ResultRelInfo *rinfo);
+ </programlisting>
+ 
+      End the insert operation and release resources.  It is normally not important
+      to release palloc'd memory, but for example open files and connections
+      to remote servers should be cleaned up.
+     </para>
+ 
+     <para>
+      If the <function>EndForeignInsert</function> pointer is set to
+      <literal>NULL</literal>, no action is taken for the termination.
+     </para>
+ 
+     <para>
  <programlisting>
  int
  IsForeignRelUpdatable(Relation rel);
*** a/doc/src/sgml/ref/copy.sgml
--- b/doc/src/sgml/ref/copy.sgml
***************
*** 402,409 **** COPY <replaceable class="parameter">count</replaceable>
     </para>
  
     <para>
!     <command>COPY FROM</command> can be used with plain tables and with views
!     that have <literal>INSTEAD OF INSERT</literal> triggers.
     </para>
  
     <para>
--- 402,410 ----
     </para>
  
     <para>
!     <command>COPY FROM</command> can be used with plain, foreign, or
!     partitioned tables and with views that have
!     <literal>INSTEAD OF INSERT</literal> triggers.
     </para>
  
     <para>
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***************
*** 29,34 ****
--- 29,35 ----
  #include "commands/trigger.h"
  #include "executor/execPartition.h"
  #include "executor/executor.h"
+ #include "foreign/fdwapi.h"
  #include "libpq/libpq.h"
  #include "libpq/pqformat.h"
  #include "mb/pg_wchar.h"
***************
*** 2284,2289 **** CopyFrom(CopyState cstate)
--- 2285,2291 ----
  	ResultRelInfo *resultRelInfo;
  	ResultRelInfo *saved_resultRelInfo = NULL;
  	EState	   *estate = CreateExecutorState(); /* for ExecConstraints() */
+ 	ModifyTableState *mtstate;
  	ExprContext *econtext;
  	TupleTableSlot *myslot;
  	MemoryContext oldcontext = CurrentMemoryContext;
***************
*** 2305,2315 **** CopyFrom(CopyState cstate)
  	Assert(cstate->rel);
  
  	/*
! 	 * The target must be a plain relation or have an INSTEAD OF INSERT row
! 	 * trigger.  (Currently, such triggers are only allowed on views, so we
! 	 * only hint about them in the view case.)
  	 */
  	if (cstate->rel->rd_rel->relkind != RELKIND_RELATION &&
  		cstate->rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE &&
  		!(cstate->rel->trigdesc &&
  		  cstate->rel->trigdesc->trig_insert_instead_row))
--- 2307,2318 ----
  	Assert(cstate->rel);
  
  	/*
! 	 * The target must be a plain, foreign, or partitioned relation, or have
! 	 * an INSTEAD OF INSERT row trigger.  (Currently, such triggers are only
! 	 * allowed on views, so we only hint about them in the view case.)
  	 */
  	if (cstate->rel->rd_rel->relkind != RELKIND_RELATION &&
+ 		cstate->rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
  		cstate->rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE &&
  		!(cstate->rel->trigdesc &&
  		  cstate->rel->trigdesc->trig_insert_instead_row))
***************
*** 2325,2335 **** CopyFrom(CopyState cstate)
  					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
  					 errmsg("cannot copy to materialized view \"%s\"",
  							RelationGetRelationName(cstate->rel))));
- 		else if (cstate->rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
- 			ereport(ERROR,
- 					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
- 					 errmsg("cannot copy to foreign table \"%s\"",
- 							RelationGetRelationName(cstate->rel))));
  		else if (cstate->rel->rd_rel->relkind == RELKIND_SEQUENCE)
  			ereport(ERROR,
  					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
--- 2328,2333 ----
***************
*** 2448,2453 **** CopyFrom(CopyState cstate)
--- 2446,2466 ----
  	/* Triggers might need a slot as well */
  	estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate, NULL);
  
+ 	/*
+ 	 * Set up a ModifyTableState so we can let FDW(s) init themselves for
+ 	 * foreign-table result relation(s).
+ 	 */
+ 	mtstate = makeNode(ModifyTableState);
+ 	mtstate->ps.plan = NULL;
+ 	mtstate->ps.state = estate;
+ 	mtstate->operation = CMD_INSERT;
+ 	mtstate->resultRelInfo = estate->es_result_relations;
+ 
+ 	if (resultRelInfo->ri_FdwRoutine != NULL &&
+ 		resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
+ 		resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
+ 														 resultRelInfo);
+ 
  	/* Prepare to catch AFTER triggers. */
  	AfterTriggerBeginQuery();
  
***************
*** 2489,2499 **** CopyFrom(CopyState cstate)
  	 * expressions. Such triggers or expressions might query the table we're
  	 * inserting to, and act differently if the tuples that have already been
  	 * processed and prepared for insertion are not there.  We also can't do
! 	 * it if the table is partitioned.
  	 */
  	if ((resultRelInfo->ri_TrigDesc != NULL &&
  		 (resultRelInfo->ri_TrigDesc->trig_insert_before_row ||
  		  resultRelInfo->ri_TrigDesc->trig_insert_instead_row)) ||
  		cstate->partition_tuple_routing != NULL ||
  		cstate->volatile_defexprs)
  	{
--- 2502,2513 ----
  	 * expressions. Such triggers or expressions might query the table we're
  	 * inserting to, and act differently if the tuples that have already been
  	 * processed and prepared for insertion are not there.  We also can't do
! 	 * it if the table is foreign or partitioned.
  	 */
  	if ((resultRelInfo->ri_TrigDesc != NULL &&
  		 (resultRelInfo->ri_TrigDesc->trig_insert_before_row ||
  		  resultRelInfo->ri_TrigDesc->trig_insert_instead_row)) ||
+ 		resultRelInfo->ri_FdwRoutine != NULL ||
  		cstate->partition_tuple_routing != NULL ||
  		cstate->volatile_defexprs)
  	{
***************
*** 2615,2625 **** CopyFrom(CopyState cstate)
  				Assert(resultRelInfo != NULL);
  			}
  
! 			/* We do not yet have a way to insert into a foreign partition */
! 			if (resultRelInfo->ri_FdwRoutine)
! 				ereport(ERROR,
! 						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 						 errmsg("cannot route inserted tuples to a foreign table")));
  
  			/*
  			 * For ExecInsertIndexTuples() to work on the partition's indexes
--- 2629,2647 ----
  				Assert(resultRelInfo != NULL);
  			}
  
! 			/*
! 			 * Verify the specified partition is a valid target for INSERT if
! 			 * we haven't done so yet.
! 			 */
! 			if (!resultRelInfo->ri_PartitionIsValid)
! 			{
! 				CheckValidResultRel(resultRelInfo, CMD_INSERT);
! 
! 				resultRelInfo->ri_PartitionIsValid = true;
! 
! 				/* Let the FDW init itself for tuple routing. */
! 				ExecInitForeignRouting(mtstate, estate, resultRelInfo);
! 			}
  
  			/*
  			 * For ExecInsertIndexTuples() to work on the partition's indexes
***************
*** 2708,2715 **** CopyFrom(CopyState cstate)
  					  resultRelInfo->ri_TrigDesc->trig_insert_before_row))
  					check_partition_constr = false;
  
! 				/* Check the constraints of the tuple */
! 				if (cstate->rel->rd_att->constr || check_partition_constr)
  					ExecConstraints(resultRelInfo, slot, estate, true);
  
  				if (useHeapMultiInsert)
--- 2730,2742 ----
  					  resultRelInfo->ri_TrigDesc->trig_insert_before_row))
  					check_partition_constr = false;
  
! 				/*
! 				 * If the target is a plain table, check the constraints of
! 				 * the tuple.
! 				 */
! 				if (resultRelInfo->ri_FdwRoutine == NULL &&
! 					(resultRelInfo->ri_RelationDesc->rd_att->constr ||
! 					 check_partition_constr))
  					ExecConstraints(resultRelInfo, slot, estate, true);
  
  				if (useHeapMultiInsert)
***************
*** 2741,2750 **** CopyFrom(CopyState cstate)
  				{
  					List	   *recheckIndexes = NIL;
  
! 					/* OK, store the tuple and create index entries for it */
! 					heap_insert(resultRelInfo->ri_RelationDesc, tuple, mycid,
! 								hi_options, bistate);
  
  					if (resultRelInfo->ri_NumIndices > 0)
  						recheckIndexes = ExecInsertIndexTuples(slot,
  															   &(tuple->t_self),
--- 2768,2799 ----
  				{
  					List	   *recheckIndexes = NIL;
  
! 					/* OK, store the tuple */
! 					if (resultRelInfo->ri_FdwRoutine != NULL)
! 					{
! 						slot = resultRelInfo->ri_FdwRoutine->ExecForeignInsert(estate,
! 																			   resultRelInfo,
! 																			   slot,
! 																			   NULL);
! 
! 						if (slot == NULL)		/* "do nothing" */
! 							goto next_tuple;
! 
! 						/* FDW might have changed tuple */
! 						tuple = ExecMaterializeSlot(slot);
  
+ 						/*
+ 						 * AFTER ROW Triggers might reference the tableoid
+ 						 * column, so initialize t_tableOid before evaluating
+ 						 * them.
+ 						 */
+ 						tuple->t_tableOid = RelationGetRelid(resultRelInfo->ri_RelationDesc);
+ 					}
+ 					else
+ 						heap_insert(resultRelInfo->ri_RelationDesc, tuple,
+ 									mycid, hi_options, bistate);
+ 
+ 					/* And create index entries for it */
  					if (resultRelInfo->ri_NumIndices > 0)
  						recheckIndexes = ExecInsertIndexTuples(slot,
  															   &(tuple->t_self),
***************
*** 2762,2774 **** CopyFrom(CopyState cstate)
  			}
  
  			/*
! 			 * We count only tuples not suppressed by a BEFORE INSERT trigger;
! 			 * this is the same definition used by execMain.c for counting
! 			 * tuples inserted by an INSERT command.
  			 */
  			processed++;
  		}
  
  		/* Restore the saved ResultRelInfo */
  		if (saved_resultRelInfo)
  		{
--- 2811,2824 ----
  			}
  
  			/*
! 			 * We count only tuples not suppressed by a BEFORE INSERT trigger
! 			 * or FDW; this is the same definition used by nodeModifyTable.c
! 			 * for counting tuples inserted by an INSERT command.
  			 */
  			processed++;
  		}
  
+ next_tuple:
  		/* Restore the saved ResultRelInfo */
  		if (saved_resultRelInfo)
  		{
***************
*** 2809,2819 **** CopyFrom(CopyState cstate)
  
  	ExecResetTupleTable(estate->es_tupleTable, false);
  
  	ExecCloseIndices(resultRelInfo);
  
  	/* Close all the partitioned tables, leaf partitions, and their indices */
  	if (cstate->partition_tuple_routing)
! 		ExecCleanupTupleRouting(cstate->partition_tuple_routing);
  
  	/* Close any trigger target relations */
  	ExecCleanUpTriggerState(estate);
--- 2859,2875 ----
  
  	ExecResetTupleTable(estate->es_tupleTable, false);
  
+ 	/* Allow the FDW to shut down */
+ 	if (resultRelInfo->ri_FdwRoutine != NULL &&
+ 		resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
+ 		resultRelInfo->ri_FdwRoutine->EndForeignInsert(estate,
+ 													   resultRelInfo);
+ 
  	ExecCloseIndices(resultRelInfo);
  
  	/* Close all the partitioned tables, leaf partitions, and their indices */
  	if (cstate->partition_tuple_routing)
! 		ExecCleanupTupleRouting(mtstate, cstate->partition_tuple_routing);
  
  	/* Close any trigger target relations */
  	ExecCleanUpTriggerState(estate);
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
***************
*** 1178,1190 **** CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
  			switch (operation)
  			{
  				case CMD_INSERT:
- 
- 					/*
- 					 * If foreign partition to do tuple-routing for, skip the
- 					 * check; it's disallowed elsewhere.
- 					 */
- 					if (resultRelInfo->ri_PartitionRoot)
- 						break;
  					if (fdwroutine->ExecForeignInsert == NULL)
  						ereport(ERROR,
  								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
--- 1178,1183 ----
***************
*** 1374,1379 **** InitResultRelInfo(ResultRelInfo *resultRelInfo,
--- 1367,1373 ----
  
  	resultRelInfo->ri_PartitionCheck = partition_check;
  	resultRelInfo->ri_PartitionRoot = partition_root;
+ 	resultRelInfo->ri_PartitionIsValid = false;
  }
  
  /*
*** a/src/backend/executor/execPartition.c
--- b/src/backend/executor/execPartition.c
***************
*** 18,23 ****
--- 18,24 ----
  #include "catalog/pg_type.h"
  #include "executor/execPartition.h"
  #include "executor/executor.h"
+ #include "foreign/fdwapi.h"
  #include "mb/pg_wchar.h"
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
***************
*** 158,170 **** ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, Relation rel)
  			proute->parent_child_tupconv_maps[i] =
  				convert_tuples_by_name(tupDesc, part_tupdesc,
  									   gettext_noop("could not convert row type"));
- 
- 			/*
- 			 * Verify result relation is a valid target for an INSERT.  An
- 			 * UPDATE of a partition-key becomes a DELETE+INSERT operation, so
- 			 * this check is required even when the operation is CMD_UPDATE.
- 			 */
- 			CheckValidResultRel(leaf_part_rri, CMD_INSERT);
  		}
  
  		proute->partitions[i] = leaf_part_rri;
--- 159,164 ----
***************
*** 338,350 **** ExecInitPartitionInfo(ModifyTableState *mtstate,
  					  estate->es_instrument);
  
  	/*
- 	 * Verify result relation is a valid target for an INSERT.  An UPDATE of a
- 	 * partition-key becomes a DELETE+INSERT operation, so this check is still
- 	 * required when the operation is CMD_UPDATE.
- 	 */
- 	CheckValidResultRel(leaf_part_rri, CMD_INSERT);
- 
- 	/*
  	 * Since we've just initialized this ResultRelInfo, it's not in any list
  	 * attached to the estate as yet.  Add it, so that it can be found later.
  	 *
--- 332,337 ----
***************
*** 461,466 **** ExecInitPartitionInfo(ModifyTableState *mtstate,
--- 448,454 ----
  		returningList = map_partition_varattnos(returningList, firstVarno,
  												partrel, firstResultRel,
  												NULL);
+ 		leaf_part_rri->ri_returningList = returningList;
  
  		/*
  		 * Initialize the projection itself.
***************
*** 631,636 **** ExecInitPartitionInfo(ModifyTableState *mtstate,
--- 619,651 ----
  }
  
  /*
+  * ExecInitForeignRouting
+  *		Let the FDW init itself for tuple routing for the partition
+  *
+  * We call this after performing CheckValidResultRel against the partition
+  * to avoid useless initialization for the FDW in
+  * ExecSetupPartitionTupleRouting and ExecInitPartitionInfo.
+  */
+ void
+ ExecInitForeignRouting(ModifyTableState *mtstate,
+ 					   EState *estate,
+ 					   ResultRelInfo *partRelInfo)
+ {
+ 	MemoryContext oldContext;
+ 
+ 	/*
+ 	 * Switch into per-query memory context.
+ 	 */
+ 	oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
+ 
+ 	if (partRelInfo->ri_FdwRoutine != NULL &&
+ 		partRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
+ 		partRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, partRelInfo);
+ 
+ 	MemoryContextSwitchTo(oldContext);
+ }
+ 
+ /*
   * ExecSetupChildParentMapForLeaf -- Initialize the per-leaf-partition
   * child-to-root tuple conversion map array.
   *
***************
*** 732,738 **** ConvertPartitionTupleSlot(TupleConversionMap *map,
   * Close all the partitioned tables, leaf partitions, and their indices.
   */
  void
! ExecCleanupTupleRouting(PartitionTupleRouting *proute)
  {
  	int			i;
  	int			subplan_index = 0;
--- 747,754 ----
   * Close all the partitioned tables, leaf partitions, and their indices.
   */
  void
! ExecCleanupTupleRouting(ModifyTableState *mtstate,
! 						PartitionTupleRouting *proute)
  {
  	int			i;
  	int			subplan_index = 0;
***************
*** 760,765 **** ExecCleanupTupleRouting(PartitionTupleRouting *proute)
--- 776,788 ----
  		if (resultRelInfo == NULL)
  			continue;
  
+ 		/* Allow any FDWs to shut down if they've been initialized */
+ 		if (resultRelInfo->ri_PartitionIsValid &&
+ 			resultRelInfo->ri_FdwRoutine != NULL &&
+ 			resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
+ 			resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
+ 														   resultRelInfo);
+ 
  		/*
  		 * If this result rel is one of the UPDATE subplan result rels, let
  		 * ExecEndPlan() close it. For INSERT or COPY,
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***************
*** 1678,1688 **** ExecPrepareTupleRouting(ModifyTableState *mtstate,
  										proute, estate,
  										partidx);
  
! 	/* We do not yet have a way to insert into a foreign partition */
! 	if (partrel->ri_FdwRoutine)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
! 				 errmsg("cannot route inserted tuples to a foreign table")));
  
  	/*
  	 * Make it look like we are inserting into the partition.
--- 1678,1703 ----
  										proute, estate,
  										partidx);
  
! 	/*
! 	 * Verify the specified partition is a valid target for INSERT if we
! 	 * haven't done so yet.
! 	 *
! 	 * Note: an UPDATE of a partition-key becomes a DELETE+INSERT operation,
! 	 * so this check is required even when the mtstate operation is
! 	 * CMD_UPDATE.  The reason we do this check here rather than in
! 	 * ExecSetupPartitionTupleRouting() is to avoid aborting the UPDATE
! 	 * unnecessarily due to non-routable subplan partitions that may not be
! 	 * chosen for tuple routing after all.
! 	 */
! 	if (!partrel->ri_PartitionIsValid)
! 	{
! 		CheckValidResultRel(partrel, CMD_INSERT);
! 
! 		partrel->ri_PartitionIsValid = true;
! 
! 		/* Let the FDW init itself for tuple routing. */
! 		ExecInitForeignRouting(mtstate, estate, partrel);
! 	}
  
  	/*
  	 * Make it look like we are inserting into the partition.
***************
*** 2355,2360 **** ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
--- 2370,2376 ----
  		{
  			List	   *rlist = (List *) lfirst(l);
  
+ 			resultRelInfo->ri_returningList = rlist;
  			resultRelInfo->ri_projectReturning =
  				ExecBuildProjectionInfo(rlist, econtext, slot, &mtstate->ps,
  										resultRelInfo->ri_RelationDesc->rd_att);
***************
*** 2637,2643 **** ExecEndModifyTable(ModifyTableState *node)
  
  	/* Close all the partitioned tables, leaf partitions, and their indices */
  	if (node->mt_partition_tuple_routing)
! 		ExecCleanupTupleRouting(node->mt_partition_tuple_routing);
  
  	/*
  	 * Free the exprcontext
--- 2653,2659 ----
  
  	/* Close all the partitioned tables, leaf partitions, and their indices */
  	if (node->mt_partition_tuple_routing)
! 		ExecCleanupTupleRouting(node, node->mt_partition_tuple_routing);
  
  	/*
  	 * Free the exprcontext
*** a/src/include/executor/execPartition.h
--- b/src/include/executor/execPartition.h
***************
*** 118,123 **** extern ResultRelInfo *ExecInitPartitionInfo(ModifyTableState *mtstate,
--- 118,126 ----
  					ResultRelInfo *resultRelInfo,
  					PartitionTupleRouting *proute,
  					EState *estate, int partidx);
+ extern void ExecInitForeignRouting(ModifyTableState *mtstate,
+ 					   EState *estate,
+ 					   ResultRelInfo *partRelInfo);
  extern void ExecSetupChildParentMapForLeaf(PartitionTupleRouting *proute);
  extern TupleConversionMap *TupConvMapForLeaf(PartitionTupleRouting *proute,
  				  ResultRelInfo *rootRelInfo, int leaf_index);
***************
*** 125,130 **** extern HeapTuple ConvertPartitionTupleSlot(TupleConversionMap *map,
  						  HeapTuple tuple,
  						  TupleTableSlot *new_slot,
  						  TupleTableSlot **p_my_slot);
! extern void ExecCleanupTupleRouting(PartitionTupleRouting *proute);
  
  #endif							/* EXECPARTITION_H */
--- 128,134 ----
  						  HeapTuple tuple,
  						  TupleTableSlot *new_slot,
  						  TupleTableSlot **p_my_slot);
! extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
! 						PartitionTupleRouting *proute);
  
  #endif							/* EXECPARTITION_H */
*** a/src/include/foreign/fdwapi.h
--- b/src/include/foreign/fdwapi.h
***************
*** 97,102 **** typedef TupleTableSlot *(*ExecForeignDelete_function) (EState *estate,
--- 97,108 ----
  typedef void (*EndForeignModify_function) (EState *estate,
  										   ResultRelInfo *rinfo);
  
+ typedef void (*BeginForeignInsert_function) (ModifyTableState *mtstate,
+ 											 ResultRelInfo *rinfo);
+ 
+ typedef void (*EndForeignInsert_function) (EState *estate,
+ 										   ResultRelInfo *rinfo);
+ 
  typedef int (*IsForeignRelUpdatable_function) (Relation rel);
  
  typedef bool (*PlanDirectModify_function) (PlannerInfo *root,
***************
*** 204,209 **** typedef struct FdwRoutine
--- 210,217 ----
  	ExecForeignUpdate_function ExecForeignUpdate;
  	ExecForeignDelete_function ExecForeignDelete;
  	EndForeignModify_function EndForeignModify;
+ 	BeginForeignInsert_function BeginForeignInsert;
+ 	EndForeignInsert_function EndForeignInsert;
  	IsForeignRelUpdatable_function IsForeignRelUpdatable;
  	PlanDirectModify_function PlanDirectModify;
  	BeginDirectModify_function BeginDirectModify;
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 435,440 **** typedef struct ResultRelInfo
--- 435,443 ----
  	/* for removing junk attributes from tuples */
  	JunkFilter *ri_junkFilter;
  
+ 	/* list of RETURNING expressions */
+ 	List	   *ri_returningList;
+ 
  	/* for computing a RETURNING list */
  	ProjectionInfo *ri_projectReturning;
  
***************
*** 452,457 **** typedef struct ResultRelInfo
--- 455,463 ----
  
  	/* relation descriptor for root partitioned table */
  	Relation	ri_PartitionRoot;
+ 
+ 	/* true if valid target for tuple routing */
+ 	bool		ri_PartitionIsValid;
  } ResultRelInfo;
  
  /* ----------------

Reply via email to