On Thu, Feb 4, 2021 at 7:21 PM Etsuro Fujita <etsuro.fuj...@gmail.com> wrote: > On Mon, Feb 1, 2021 at 12:06 PM Etsuro Fujita <etsuro.fuj...@gmail.com> wrote: > > Rather than doing so, I'd like to propose to allow > > FDWs to disable async execution of them in problematic cases by > > themselves during executor startup in the first cut. What I have in > > mind for that is: > > > > 1) For an FDW that has async-capable ForeignScan(s), we allow the FDW > > to record, for each of the async-capable and non-async-capable > > ForeignScan(s), the information on a connection to be used for the > > ForeignScan into EState during BeginForeignScan(). > > > > 2) After doing ExecProcNode() to each SubPlan and the main query tree > > in InitPlan(), we give the FDW a chance to a) reconsider, for each of > > the async-capable ForeignScan(s), whether the ForeignScan can be > > executed asynchronously as planned, based on the information stored > > into EState in #1, and then b) disable async execution of the > > ForeignScan if not. > > s/ExecProcNode()/ExecInitNode()/. Sorry for that. I’ll post an > updated patch for this in a few days.
I created a WIP patch for this. For #2, I added a new callback routine ReconsiderAsyncForeignScan(). The routine for postgres_fdw postgresReconsiderAsyncForeignScan() is pretty simple: async execution of an async-capable ForeignScan is disabled if the connection used for it is used in other parts of the query plan tree except async subplans just below the parent Append. Here is a running example: postgres=# create table t1 (a int, b int, c text); postgres=# create table t2 (a int, b int, c text); postgres=# create foreign table p1 (a int, b int, c text) server server1 options (table_name 't1'); postgres=# create foreign table p2 (a int, b int, c text) server server2 options (table_name 't2'); postgres=# create table pt (a int, b int, c text) partition by range (a); postgres=# alter table pt attach partition p1 for values from (10) to (20); postgres=# alter table pt attach partition p2 for values from (20) to (30); postgres=# insert into p1 select 10 + i % 10, i, to_char(i, 'FM0000') from generate_series(0, 99) i; postgres=# insert into p2 select 20 + i % 10, i, to_char(i, 'FM0000') from generate_series(0, 99) i; postgres=# analyze pt; postgres=# create table loct (a int, b int); postgres=# create foreign table ft (a int, b int) server server1 options (table_name 'loct'); postgres=# insert into ft select i, i from generate_series(0, 99) i; postgres=# analyze ft; postgres=# create view v as select * from ft; postgres=# explain verbose select * from pt, v where pt.b = v.b and v.b = 99; QUERY PLAN ----------------------------------------------------------------------------------------- Nested Loop (cost=200.00..306.84 rows=2 width=21) Output: pt.a, pt.b, pt.c, ft.a, ft.b -> Foreign Scan on public.ft (cost=100.00..102.27 rows=1 width=8) Output: ft.a, ft.b Remote SQL: SELECT a, b FROM public.loct WHERE ((b = 99)) -> Append (cost=100.00..204.55 rows=2 width=13) -> Foreign Scan on public.p1 pt_1 (cost=100.00..102.27 rows=1 width=13) Output: pt_1.a, pt_1.b, pt_1.c Remote SQL: SELECT a, b, c FROM public.t1 WHERE ((b = 99)) -> Async Foreign Scan on public.p2 pt_2 (cost=100.00..102.27 rows=1 width=13) Output: pt_2.a, pt_2.b, pt_2.c Remote SQL: SELECT a, b, c FROM public.t2 WHERE ((b = 99)) (12 rows) For this query, while p2 is executed asynchronously, p1 isn’t as it uses the same connection with ft. BUT: postgres=# create role view_owner SUPERUSER; postgres=# create user mapping for view_owner server server1; postgres=# alter view v owner to view_owner; postgres=# explain verbose select * from pt, v where pt.b = v.b and v.b = 99; QUERY PLAN ----------------------------------------------------------------------------------------- Nested Loop (cost=200.00..306.84 rows=2 width=21) Output: pt.a, pt.b, pt.c, ft.a, ft.b -> Foreign Scan on public.ft (cost=100.00..102.27 rows=1 width=8) Output: ft.a, ft.b Remote SQL: SELECT a, b FROM public.loct WHERE ((b = 99)) -> Append (cost=100.00..204.55 rows=2 width=13) -> Async Foreign Scan on public.p1 pt_1 (cost=100.00..102.27 rows=1 width=13) Output: pt_1.a, pt_1.b, pt_1.c Remote SQL: SELECT a, b, c FROM public.t1 WHERE ((b = 99)) -> Async Foreign Scan on public.p2 pt_2 (cost=100.00..102.27 rows=1 width=13) Output: pt_2.a, pt_2.b, pt_2.c Remote SQL: SELECT a, b, c FROM public.t2 WHERE ((b = 99)) (12 rows) in this setup, p1 is executed asynchronously as ft doesn’t use the same connection with p1. I added to postgresReconsiderAsyncForeignScan() this as well: even if the connection isn’t used in the other parts, async execution of an async-capable ForeignScan is disabled if the subplans of the Append are all async-capable, and they use the same connection, because in that case the subplans won’t be parallelized at all, and the overhead of async execution may cause a performance degradation. Attached is an updated version of the patch. Sorry for the delay. Best regards, Etsuro Fujita
async-wip-2021-02-10.patch
Description: Binary data