On Thu, Feb 18, 2021 at 3:16 PM Kyotaro Horiguchi <horikyota....@gmail.com> wrote: > At Thu, 18 Feb 2021 11:51:59 +0900, Etsuro Fujita <etsuro.fuj...@gmail.com> > wrote in > > I noticed that this doesn’t work for cases where ForeignScans are > > executed inside functions, and I don’t have any simple solution for > > Ah, concurrent fetches in different plan trees? (For fairness, I > hadn't noticed that case:p) The same can happen when an extension that > is called via hooks.
Yeah, consider a plan containing a FunctionScan that invokes a query like e.g., “SELECT * FROM foreign_table” via SPI. > > So I’m getting back to what Horiguchi-san proposed for > > postgres_fdw to handle concurrent fetches from a remote server > > performed by multiple ForeignScan nodes that use the same connection. > > As discussed before, we would need to create a scheduler for > > performing such fetches in a more optimized way to avoid a performance > > degradation in some cases, but that wouldn’t be easy. > > If the "degradation" means degradation caused by repeated creation of > remote cursors, anyway every node on the same connection create its > own connection named as "c<n>" and never "re"created in any case. > > If the "degradation" means that my patch needs to wait for the > previous prefetching query to return tuples before sending a new query > (vacate_connection()), it is just moving the wait from just before > sending the new query to just before fetching the next round of the > previous node. The only case it becomes visible degradation is where > the tuples in the next round is not wanted by the upper nodes. The latter. And yeah, typical cases where the performance degradation occurs would be queries with LIMIT, as discussed in [1]. I’m not concerned about postgres_fdw modified to process an in-progress fetch by a ForeignScan before starting a new asynchronous/synchronous fetch by another ForeignScan using the same connection. Actually, that seems pretty reasonable to me, so I’d like to use that part in your patch in the next version. My concern is that postgresIterateForeignScan() was modified to start another asynchronous fetch from a remote table (if possible) right after doing fetch_received_data() for the remote table, because aggressive prefetching like that may increase the probability that ForeignScans using the same connection conflict with each other, leading to a large performance degradation. (Another issue with that would be that the fsstate->tuples array for the remote table may be enlarged indefinitely.) Whether the degradation is acceptable or not would depend on the user, and needless to say, the smaller degradation would be more acceptable. So I’ll update the patch using your patch without the postgresIterateForeignScan() change. > > In his proposal, > > postgres_fdw was modified to perform prefetching pretty aggressively, > > so I mean removing aggressive prefetching. I think we could add it to > > postgres_fdw later maybe as the server/table options. > That was the natural extension from non-aggresive prefetching. I also suppose that that would improve the performance in some cases. Let’s leave that for future work. > However, maybe we can live without that since if some needs more > speed, it is enought to give every remote tables a dedicate > connection. Yeah, I think so too. Thanks! Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/CAPmGK16E1erFV9STg8yokoewY6E-zEJtLzHUJcQx%2B3dyivCT%3DA%40mail.gmail.com