On Tue, Aug 9, 2022 at 4:45 AM Etsuro Fujita <etsuro.fuj...@gmail.com> wrote:
> On Tue, Jul 26, 2022 at 7:19 PM Etsuro Fujita <etsuro.fuj...@gmail.com> > wrote: > > Yeah, I think the patch is getting better, but I noticed some issues, > > so I'm working on them. I think I can post a new version in the next > > few days. > > * When running AFTER ROW triggers in CopyMultiInsertBufferFlush(), the > patch uses the slots passed to ExecForeignBatchInsert(), not the ones > returned by the callback function, but I don't think that that is > always correct, as the documentation about the callback function says: > > The return value is an array of slots containing the data that was > actually inserted (this might differ from the data supplied, for > example as a result of trigger actions.) > The passed-in <literal>slots</literal> can be re-used for this > purpose. > > postgres_fdw re-uses the passed-in slots, but other FDWs might not, so > I modified the patch to reference the returned slots when running the > AFTER ROW triggers. I also modified the patch to initialize the > tts_tableOid. Attached is an updated patch, in which I made some > minor adjustments to CopyMultiInsertBufferFlush() as well. > > * The patch produces incorrect error context information: > > create extension postgres_fdw; > create server loopback foreign data wrapper postgres_fdw options > (dbname 'postgres'); > create user mapping for current_user server loopback; > create table t1 (f1 int, f2 text); > create foreign table ft1 (f1 int, f2 text) server loopback options > (table_name 't1'); > alter table t1 add constraint t1_f1positive check (f1 >= 0); > alter foreign table ft1 add constraint ft1_f1positive check (f1 >= 0); > > — single insert > copy ft1 from stdin; > Enter data to be copied followed by a newline. > End with a backslash and a period on a line by itself, or an EOF signal. > >> -1 foo > >> 1 bar > >> \. > ERROR: new row for relation "t1" violates check constraint "t1_f1positive" > DETAIL: Failing row contains (-1, foo). > CONTEXT: remote SQL command: INSERT INTO public.t1(f1, f2) VALUES ($1, $2) > COPY ft1, line 1: "-1 foo" > > — batch insert > alter server loopback options (add batch_size '2'); > copy ft1 from stdin; > Enter data to be copied followed by a newline. > End with a backslash and a period on a line by itself, or an EOF signal. > >> -1 foo > >> 1 bar > >> \. > ERROR: new row for relation "t1" violates check constraint "t1_f1positive" > DETAIL: Failing row contains (-1, foo). > CONTEXT: remote SQL command: INSERT INTO public.t1(f1, f2) VALUES > ($1, $2), ($3, $4) > COPY ft1, line 3 > > In single-insert mode the error context information is correct, but in > batch-insert mode it isn’t (i.e., the line number isn’t correct). > > The error occurs on the remote side, so I'm not sure if there is a > simple fix. What I came up with is to just suppress error context > information other than the relation name, like the attached. What do > you think about that? > > (In CopyMultiInsertBufferFlush() your patch sets cstate->cur_lineno to > buffer->linenos[i] even when running AFTER ROW triggers for the i-th > row returned by ExecForeignBatchInsert(), but that wouldn’t always be > correct, as the i-th returned row might not correspond to the i-th row > originally stored in the buffer as the callback function returns only > the rows that were actually inserted on the remote side. I think the > proposed fix would address this issue as well.) > > * The patch produces incorrect row count in cases where some/all of > the rows passed to ExecForeignBatchInsert() weren’t inserted on the > remote side: > > create function trig_null() returns trigger as $$ begin return NULL; > end $$ language plpgsql; > create trigger trig_null before insert on t1 for each row execute > function trig_null(); > > — single insert > alter server loopback options (drop batch_size); > copy ft1 from stdin; > Enter data to be copied followed by a newline. > End with a backslash and a period on a line by itself, or an EOF signal. > >> 0 foo > >> 1 bar > >> \. > COPY 0 > > — batch insert > alter server loopback options (add batch_size '2'); > copy ft1 from stdin; > Enter data to be copied followed by a newline. > End with a backslash and a period on a line by itself, or an EOF signal. > >> 0 foo > >> 1 bar > >> \. > COPY 2 > > The row count is correct in single-insert mode, but isn’t in batch-insert > mode. > > The reason is that in batch-insert mode the row counter is updated > immediately after adding the row to the buffer, not after doing > ExecForeignBatchInsert(), which might ignore the row. To fix, I > modified the patch to delay updating the row counter (and the progress > of the COPY command) until after doing the callback function. For > consistency, I also modified the patch to delay it even when batching > into plain tables. IMO I think that that would be more consistent > with the single-insert mode, as in that mode we update them after > writing the tuple out to the table or sending it to the remote side. > > * I modified the patch so that when batching into foreign tables we > skip useless steps in CopyMultiInsertBufferInit() and > CopyMultiInsertBufferCleanup(). > > That’s all I have for now. Sorry for the delay. > > Best regards, > Etsuro Fujita > Hi, + /* If any rows were inserted, run AFTER ROW INSERT triggers. */ ... + for (i = 0; i < inserted; i++) + { + TupleTableSlot *slot = rslots[i]; ... + slot->tts_tableOid = + RelationGetRelid(resultRelInfo->ri_RelationDesc); It seems the return value of `RelationGetRelid(resultRelInfo->ri_RelationDesc)` can be stored in a variable outside the for loop. Inside the for loop, assign this variable to slot->tts_tableOid. Cheers