Hi! Thank you for your work on this subject! I think this is a very useful optimization.

While looking through your code, I noticed a few points that I think should be taken into account. First, there are only two tests verifying this functionality, which I don't think is enough. Are you planning to add tests with queries involving, for example, joins across different tables and less common operators?

In addition, I have a question about benchmarking this feature. Are you going to do that?

On 17.07.2024 16:24, Alexander Pyhalov wrote:
Hello.

I'd like to make the MergeAppend node async-capable, like the Append node. Currently, when the planner chooses a MergeAppend plan, asynchronous execution is not possible. With the attached patches you can see plans like

EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM async_pt WHERE b % 100 = 0 ORDER BY b, a;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Merge Append
   Sort Key: async_pt.b, async_pt.a
   ->  Async Foreign Scan on public.async_p1 async_pt_1
         Output: async_pt_1.a, async_pt_1.b, async_pt_1.c
         Remote SQL: SELECT a, b, c FROM public.base_tbl1 WHERE (((b % 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST
   ->  Async Foreign Scan on public.async_p2 async_pt_2
         Output: async_pt_2.a, async_pt_2.b, async_pt_2.c
         Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE (((b % 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST

This can be quite profitable (in our test cases, async MergeAppend execution on remote servers runs up to two times faster).

The code for asynchronous execution in MergeAppend was mostly borrowed from the Append node.

What differs significantly is that ExecMergeAppendAsyncGetNext() must return a tuple from the specified slot: the subplan number determines the tuple slot into which data should be retrieved. When a subplan is ready to provide data, its result is cached in ms_asyncresults. Once we get a tuple for the subplan specified in ExecMergeAppendAsyncGetNext(), ExecMergeAppendAsyncRequest() returns true and the loop in ExecMergeAppendAsyncGetNext() ends. We can fetch data for subplans which either don't have a cached result ready or have already returned it to the upper node; this state is tracked in ms_has_asyncresults. As data for a subplan can arrive either before or after the loop in ExecMergeAppendAsyncRequest(), we check this flag twice in that function.
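To make the bookkeeping concrete, here is a minimal standalone model of the state described above. This is not the patch's actual code or PostgreSQL's real structs (the struct and helper names here are illustrative only); it just demonstrates the invariant: a result cached per subplan is handed out exactly once, and the "has result" flag tells us which subplans may be asked for more data.

```c
#include <assert.h>
#include <stdbool.h>

#define NPLANS 2

/* Hypothetical, simplified model of the per-subplan async result cache. */
typedef struct MergeAppendModel
{
    int  results[NPLANS];      /* cached "tuple" per subplan (-1 = none) */
    bool has_result[NPLANS];   /* result cached and not yet returned */
} MergeAppendModel;

/* A subplan's async request completes: cache its result. */
static void
deliver_result(MergeAppendModel *ms, int subplan, int tuple)
{
    ms->results[subplan] = tuple;
    ms->has_result[subplan] = true;
}

/* Return the cached result for 'subplan', marking it consumed so a new
 * request can be issued for that subplan; -1 means the caller would have
 * to keep waiting in the event loop. */
static int
get_next(MergeAppendModel *ms, int subplan)
{
    if (!ms->has_result[subplan])
        return -1;
    ms->has_result[subplan] = false;
    return ms->results[subplan];
}
```

In the real executor the "tuple" is of course a slot, and the waiting happens in the async event loop; the model only shows why the flag must be consulted both before and after requesting.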
Unlike ExecAppendAsyncEventWait(), ExecMergeAppendAsyncEventWait() doesn't seem to need a timeout: there's no need to get a result from a synchronous subplan when a tuple from an async one was explicitly requested.

We also had to fix postgres_fdw to avoid looking directly at Append fields. The accessors to Append fields may look strange, but they avoid some code duplication. I suppose duplication could be reduced further by reworking the async Append implementation, but so far I haven't tried that, to avoid a big diff from master.
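For readers unfamiliar with the shape of such accessors, here is a sketch of the idea. The names and fields below are illustrative, not the patch's actual API: the point is that postgres_fdw calls a small function instead of dereferencing an Append-specific struct member, so the same call works whether the node is an Append or a MergeAppend.

```c
#include <assert.h>

/* Hypothetical common state shared by Append-like nodes. */
typedef struct AppendLikeState
{
    int nasyncplans;   /* number of async-capable subplans */
} AppendLikeState;

/* Accessor: callers such as postgres_fdw use this instead of reaching
 * into AppendState or MergeAppendState fields directly. */
static int
appendlike_nasyncplans(const AppendLikeState *state)
{
    return state->nasyncplans;
}
```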

Also, mark_async_capable() assumes that the path corresponds to the plan. This may not be true when create_[merge_]append_plan() inserts a Sort node; in that case mark_async_capable() can treat the Sort plan node as something else and crash, so there's a small fix for this.
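The fix amounts to checking the plan node's actual tag before interpreting its fields. A minimal standalone illustration (the real code uses PostgreSQL's IsA() on the Plan node; the enum and function below are simplified stand-ins, not the patch itself):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for PostgreSQL's node tags. */
typedef enum { T_ForeignScan, T_Sort } NodeTag;

typedef struct Plan { NodeTag tag; } Plan;

/* Only treat the plan as async-capable when it really is the node the
 * path describes; if create_[merge_]append_plan() inserted a Sort on
 * top, bail out instead of misinterpreting the Sort's fields. */
static bool
mark_async_capable_model(const Plan *plan)
{
    if (plan->tag == T_Sort)
        return false;   /* path no longer matches plan: skip marking */
    return true;
}
```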

I think you should add this explanation to the commit message, because without it, it's hard to get the full picture of how your code works.

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


