On Wed, Feb 17, 2021 at 5:46 PM Amit Langote <amitlangot...@gmail.com> wrote: > On Wed, Feb 17, 2021 at 12:04 AM Tomas Vondra > <tomas.von...@enterprisedb.com> wrote: > > On 2/16/21 10:25 AM, Amit Langote wrote: > > > On Tue, Feb 16, 2021 at 1:36 AM Tomas Vondra > > >> Thanks for the patch, it seems fine to me. > > > > > > Thanks for checking. > > > > > >> I wonder it the commit > > >> message needs some tweaks, though. At the moment it says: > > >> > > >> Prevent FDW insert batching during cross-partition updates > > >> > > >> but what the patch seems to be doing is simply initializing the info > > >> only for CMD_INSERT operations. Which does the trick, but it affects > > >> everything, i.e. all updates, no? Not just cross-partition updates. > > > > > > You're right. Please check the message in the updated patch. > > > > Thanks. I'm not sure I understand what "FDW may not be able to handle > > both the original update operation and the batched insert operation > > being performed at the same time" means. I mean, if we translate the > > UPDATE into DELETE+INSERT, then we don't run both the update and insert > > at the same time, right? What exactly is the problem with allowing > > batching for inserts in cross-partition updates? > > Sorry, I hadn't shared enough details of my investigations when I > originally ran into this. Such as that I had considered implementing > the use of batching for these inserts too but had given up. > > Now that you mention it, I think I gave a less convincing reason for > why we should avoid doing it at all. Maybe it would have been more > right to say that it is the core code, not necessarily the FDWs, that > currently fails to deal with the use of batching by the insert > component of a cross-partition update. Those failures could be > addressed as I'll describe below. > > For postgres_fdw, postgresGetForeignModifyBatchSize() could be taught > to simply use the PgFdwModifyTable that is installed to handle the > insert component of a cross-partition update (one can get that one via > aux_fmstate field of the original PgFdwModifyState). However, even > though that's fine for postgres_fdw to do, what worries (had worried) > me is that it also results in scribbling on ri_BatchSize that the core > code may see to determine what to do with a particular tuple, and I > just have to hope that nodeModifyTable.c doesn't end up doing anything > unwarranted with the original update based on seeing a non-zero > ri_BatchSize. AFAICS, we are fine on that front. > > That said, there are some deficiencies in the code that have to be > addressed before we can let postgres_fdw do as mentioned above. For > example, the code in ExecModifyTable() that runs after breaking out of > the loop to insert any remaining batched tuples appears to miss the > tuples batched by such inserts. Apparently, that is because the > ResultRelInfos used by those inserts are not present in > es_tuple_routing_result_relations. Turns out I had forgotten that > execPartition.c doesn't add the ResultRelInfos to that list if they > are made by ExecInitModifyTable() for the original update operation > and simply reused by ExecFindPartition() when tuples were routed to > those partitions. It can be "fixed" by reverting to the original > design in Tsunakawa-san's patch where the tuple routing result > relations were obtained from the PartitionTupleRouting data structure, > which fortunately stores all tuple routing result relations. (Sorry, > I gave wrong advice in [1] in retrospect.) > > > On a closer look, it seems the problem actually lies in a small > > inconsistency between create_foreign_modify and ExecInitRoutingInfo. The > > former only set batch_size for CMD_INSERT while the latter called the > > BatchSize() for all operations, expecting >= 1 result. So we may either > > relax create_foreign_modify and set batch_size for all DML, or make > > ExecInitRoutingInfo stricter (which is what the patches here do). > > I think we should be fine if we make > postgresGetForeignModifyBatchSize() use the correct PgFdwModifyState > as described above. We can be sure that we are not mixing the > information used by the batched insert with that of the original > unbatched update. > > > Is there a reason not to do the first thing, allowing batching of > > inserts during cross-partition updates? I tried to do that, but it > > dawned on me that we can't mix batched and un-batched operations, e.g. > > DELETE + INSERT, because that'd break the order of execution, leading to > > bogus results in case the same row is modified repeatedly, etc. > > Actually, postgres_fdw only supports moving a row into a partition (as > part of a cross-partition update that is) if it has already finished > performing any updates on it. So there is no worry of rows that are > moved into a partition subsequently getting updated due to the > original command. > > The attached patch implements the changes necessary to make these > inserts use batching too. > > [1] > https://www.postgresql.org/message-id/CA%2BHiwqEbnhwVJMsukTP-S9Kv1ynC7Da3yuqSPZC0Y7oWWOwoHQ%40mail.gmail.com
Oops, I had mistakenly not hit "Reply All". Attaching the patch again. -- Amit Langote EDB: http://www.enterprisedb.com
v5-0001-Allow-batching-of-inserts-to-occur-in-some-cases.patch
Description: Binary data