From: Craig Ringer <craig.rin...@enterprisedb.com> > It was not my intention to hold this patch up or greatly expand its > scope. I'll spend some time testing it out and have a closer read soon > to see if I can help progress it.
Thank you, I'm relieved to hear that. Last weekend, I was scared of a possible mood that's something like "We won't accept the insert speedup patch for foreign tables unless you take full advantage of pipelining and achieve maximum conceivable speed!" > I thought I gave it at the time, and a demo program. IIRC it was just > doing small multi row inserts or single row inserts. Latency would've > been a couple of hundred ms probably, I think I did something like > running on my laptop (Australia, ADSL) to a server on AWS US or EU. a couple of hundred ms, so that would be dominant in each prepare-send-execute-receive, possibly even for batch insert with hundreds of rows in each batch. Then, the synchronous batch insert of the current patch may achieve a few hundreds times speedup compared to a single row inserts when the batch size is hundreds or more. > > I'd like to check other DBMSs and your rich references for the FDW > > interface. > (My first intuition is that many major DBMSs might not have client C APIs that > can be used to implement an async pipelining FDW interface. > > Likely correct for C APIs of other traditional DBMSes. I'd be less > sure about newer non SQL ones, especially cloud oriented. For example > DynamoDB supports at least async requests in the Java client [3] and > C++ client [4]; it's not immediately clear if requests can be > pipelined, but the API suggests they can. I've checked ODBC, MySQL, Microsoft Synapse Analytics, Redshift, and BigQuery, guessing that the data warehouse may have asynchronous/pipelining API that enables efficient data integration/migration. But none of them had one. (I seem to have spent too long and am a bit tired... but it was a bit fun as well.) They all support INSERT with multiple records in its VALUES clause. So, it will be useful to provide a synchronous batch insert FDW API. I guess Oracle's OCI has an asynchronous API, but I didn't check it. As an aside, MySQL 8.0.16 added support for asynchronous execution in its C API, but it allows only one active SQL statement in each connection. Likewise, although the ODBC standard defines asynchronous execution (SQLSetStmtAttr(SQL_ASYNC_ENABLE) and SQLCompleteAsync), SQL Server and Synapse Analytics only allows only one active statement per connection. psqlODBC doesn't support asynchronous execution. > Most things with a REST-like API can do a fair bit of concurrency > though. Multiple async nonblocking HTTP connections can be serviced at > once. Or HTTP/1.1 pipelining can be used [1], or even better HTTP/2.0 > streams [2]. This is relevant for any REST-like API. I'm not sure if this is related, Google deprecated Batch HTTP API [1]. [1] https://cloud.google.com/bigquery/batch Regards Takayuki Tsunakawa