On 11/23/20 3:17 AM, tsunakawa.ta...@fujitsu.com wrote:
> From: Tomas Vondra <tomas.von...@enterprisedb.com>
>> I don't think this is usable in practice, because a single session
>> may be using multiple FDW servers, with different implementations,
>> latency to the data nodes, etc. It's unlikely a single GUC value
>> will be suitable for all of them.
>
> That makes sense. The row size varies from table to table, so the
> user may want to tune this option to reduce memory consumption.
>
> I think the attached patch has reflected all your comments. I hope
> this will pass..
>
Thanks - I haven't had time for a thorough review yet, so I only
skimmed through the diff and did a couple of very simple tests. And I
think overall it looks quite nice. A couple of minor comments/questions:

1) We're calling it "batch_size" but the API function is named
postgresGetMaxBulkInsertTuples(). Perhaps we should rename the function
to postgresGetModifyBatchSize()? That has the advantage that it'd still
work if we ever add support for batching to UPDATE/DELETE.

2) Do we have to look up the batch_size in create_foreign_modify (in
server/table options)? I'd have expected to look it up while planning
the modify and then pass it through the private list, just like the
other FdwModifyPrivateIndex stuff. But maybe that's not possible.
(There's a rough sketch of what I mean at the end of this mail.)

3) That reminds me - should we show the batching info in EXPLAIN? That
seems like a fairly interesting thing to show to the user. Perhaps
showing the average batch size would also be useful? Or maybe not - we
create the batches as large as possible, with only the last one
smaller. (Again, sketch at the end of this mail.)

4) It seems that ExecInsert executes GetMaxBulkInsertTuples() over and
over for every tuple. I don't know if that has a measurable impact, but
it seems a bit excessive IMO. I don't think we should support the batch
size changing during execution (that seems tricky), so it should be
enough to determine it once up front. (Sketch at the end of this mail.)

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
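
For (2), I was thinking of something like this (entirely untested, and
get_batch_size_option / FdwModifyPrivateBatchSize are just names I made
up for the sketch):

    /*
     * Determine the batch size for a foreign table at plan time. A
     * table-level option overrides the server-level one; default to 1,
     * i.e. no batching.
     */
    static int
    get_batch_size_option(Oid foreigntableid)
    {
        ForeignTable   *table = GetForeignTable(foreigntableid);
        ForeignServer  *server = GetForeignServer(table->serverid);
        List           *options;
        ListCell       *lc;
        int             batch_size = 1;

        /* table options first, so they win over server options */
        options = list_concat(list_copy(table->options), server->options);

        foreach(lc, options)
        {
            DefElem    *def = (DefElem *) lfirst(lc);

            if (strcmp(def->defname, "batch_size") == 0)
            {
                batch_size = strtol(defGetString(def), NULL, 10);
                break;
            }
        }

        return batch_size;
    }

postgresPlanForeignModify would then append the value to the private
list it already builds, roughly

    fdw_private = list_make4(makeString(sql.data),
                             targetAttrs,
                             makeInteger((retrieved_attrs != NIL)),
                             retrieved_attrs);
    fdw_private = lappend(fdw_private,
                          makeInteger(get_batch_size_option(rte->relid)));
    return fdw_private;

with a new FdwModifyPrivateBatchSize member in the FdwModifyPrivateIndex
enum, so that postgresBeginForeignModify can fetch it with
intVal(list_nth(fdw_private, FdwModifyPrivateBatchSize)) and hand it to
create_foreign_modify.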
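
For (3), assuming the batch size ends up in the private list as above,
postgresExplainForeignModify could expose it with something like

    if (es->verbose)
    {
        int     batch_size = intVal(list_nth(fdw_private,
                                             FdwModifyPrivateBatchSize));

        ExplainPropertyInteger("Batch Size", NULL, batch_size, es);
    }

so EXPLAIN (VERBOSE) would print an extra "Batch Size: N" line under the
Insert node, next to the existing "Remote SQL" property.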
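
And for (4), the rough idea would be to determine the batch size just
once, in ExecInitModifyTable, and cache it - say in a new ResultRelInfo
field ri_BatchSize (both the field and the callback's argument list are
just guesses on my part):

    /* in ExecInitModifyTable, once per result relation */
    resultRelInfo->ri_BatchSize = 1;    /* no batching by default */
    if (resultRelInfo->ri_FdwRoutine != NULL &&
        resultRelInfo->ri_FdwRoutine->GetMaxBulkInsertTuples != NULL)
        resultRelInfo->ri_BatchSize =
            resultRelInfo->ri_FdwRoutine->GetMaxBulkInsertTuples(resultRelInfo);

ExecInsert would then just look at resultRelInfo->ri_BatchSize to decide
whether (and how much) to batch, instead of calling into the FDW for
every tuple.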