Hi! Thank you for your work on this subject! I think this is a very useful optimization.

While looking through your code, I noticed a few points that I think should be taken into account. First, there are only two tests verifying this functionality, which I don't think is enough. Are you planning to add tests with queries involving, for example, joins across different tables and less common operators?

In addition, I have a question about benchmarking this feature. Are you going to do that?

On 17.07.2024 16:24, Alexander Pyhalov wrote:
Hello.

I'd like to make the MergeAppend node async-capable, like the Append node. Currently, when the planner chooses a MergeAppend plan, asynchronous execution is not possible. With the attached patches you can see plans like

EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM async_pt WHERE b % 100 = 0 ORDER BY b, a;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Merge Append
   Sort Key: async_pt.b, async_pt.a
   ->  Async Foreign Scan on public.async_p1 async_pt_1
         Output: async_pt_1.a, async_pt_1.b, async_pt_1.c
         Remote SQL: SELECT a, b, c FROM public.base_tbl1 WHERE (((b % 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST
   ->  Async Foreign Scan on public.async_p2 async_pt_2
         Output: async_pt_2.a, async_pt_2.b, async_pt_2.c
         Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE (((b % 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST

This can be quite profitable (in our test cases, async MergeAppend execution on remote servers runs up to two times faster).

The code for asynchronous execution in MergeAppend was mostly borrowed from the Append node.

What differs significantly is that ExecMergeAppendAsyncGetNext() must return a tuple from the specified slot: the subplan number determines the tuple slot into which data should be retrieved. When a subplan is ready to provide data, its result is cached in ms_asyncresults. Once we get a tuple for the subplan specified in ExecMergeAppendAsyncGetNext(), ExecMergeAppendAsyncRequest() returns true and the loop in ExecMergeAppendAsyncGetNext() ends. We can fetch data for subplans which either don't have a cached result ready or have already returned it to the upper node; this state is tracked in ms_has_asyncresults. As data for a subplan can arrive either before or after the loop in ExecMergeAppendAsyncRequest(), we check this flag twice in that function.
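To make the bookkeeping concrete, here is a minimal standalone model of the state described above. This is not the patch's actual code or PostgreSQL's real structs (the struct and helper names here are illustrative only); it just demonstrates the invariant: a result cached per subplan is handed out exactly once, and the "has result" flag tells us which subplans may be asked for more data.

```c
#include <assert.h>
#include <stdbool.h>

#define NPLANS 2

/* Hypothetical, simplified model of the per-subplan async result cache. */
typedef struct MergeAppendModel
{
    int  results[NPLANS];      /* cached "tuple" per subplan (-1 = none) */
    bool has_result[NPLANS];   /* result cached and not yet returned */
} MergeAppendModel;

/* A subplan's async request completes: cache its result. */
static void
deliver_result(MergeAppendModel *ms, int subplan, int tuple)
{
    ms->results[subplan] = tuple;
    ms->has_result[subplan] = true;
}

/* Return the cached result for 'subplan', marking it consumed so a new
 * request can be issued for that subplan; -1 means the caller would have
 * to keep waiting in the event loop. */
static int
get_next(MergeAppendModel *ms, int subplan)
{
    if (!ms->has_result[subplan])
        return -1;
    ms->has_result[subplan] = false;
    return ms->results[subplan];
}
```

In the real executor the "tuple" is of course a slot, and the waiting happens in the async event loop; the model only shows why the flag must be consulted both before and after requesting.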
Unlike ExecAppendAsyncEventWait(), ExecMergeAppendAsyncEventWait() doesn't seem to need a timeout: there's no need to get a result from a synchronous subplan when a tuple from an async one was explicitly requested.

We also had to fix postgres_fdw to avoid looking directly at Append fields. The accessors to Append fields may look strange, but they avoid some code duplication. I suppose duplication could be reduced further by reworking the async Append implementation, but so far I haven't tried that, to avoid a big diff from master.
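For readers unfamiliar with the shape of such accessors, here is a sketch of the idea. The names and fields below are illustrative, not the patch's actual API: the point is that postgres_fdw calls a small function instead of dereferencing an Append-specific struct member, so the same call works whether the node is an Append or a MergeAppend.

```c
#include <assert.h>

/* Hypothetical common state shared by Append-like nodes. */
typedef struct AppendLikeState
{
    int nasyncplans;   /* number of async-capable subplans */
} AppendLikeState;

/* Accessor: callers such as postgres_fdw use this instead of reaching
 * into AppendState or MergeAppendState fields directly. */
static int
appendlike_nasyncplans(const AppendLikeState *state)
{
    return state->nasyncplans;
}
```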

Also, mark_async_capable() assumes that the path corresponds to the plan. This may not be true when create_[merge_]append_plan() inserts a Sort node; in that case mark_async_capable() can treat the Sort plan node as something else and crash, so there's a small fix for this.
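The fix amounts to checking the plan node's actual tag before interpreting its fields. A minimal standalone illustration (the real code uses PostgreSQL's IsA() on the Plan node; the enum and function below are simplified stand-ins, not the patch itself):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for PostgreSQL's node tags. */
typedef enum { T_ForeignScan, T_Sort } NodeTag;

typedef struct Plan { NodeTag tag; } Plan;

/* Only treat the plan as async-capable when it really is the node the
 * path describes; if create_[merge_]append_plan() inserted a Sort on
 * top, bail out instead of misinterpreting the Sort's fields. */
static bool
mark_async_capable_model(const Plan *plan)
{
    if (plan->tag == T_Sort)
        return false;   /* path no longer matches plan: skip marking */
    return true;
}
```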

I think you should add this explanation to the commit message, because without it, it's hard to get the full picture of how your code works.

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


