Thank you for the detailed analysis; I'll look into it too.  (The times have 
changed...)

From: Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com>
> Well, one might think to add more blocks at a time, say
> Min(1024, lockWaiters * 128/256/512) than currently extraBlocks =
> Min(512, lockWaiters * 20);. This will work (i.e. we don't see any
> regression with parallel inserts in CTAS patches), but it can't be a
> practical solution. Because the total pages for the relation will be
> more with many pages having more free space. Furthermore, the future
> sequential scans on that relation might take a lot of time.

> Otherwise, the similar speed up can be observed when the BAS_BULKWRITE
> is increased a bit from the current 16MB to some other reasonable
> value. I earlier tried these experiments.
> 
> Otherwise, as I said in [1], we can also increase the number of extra
> blocks added at a time, say Min(1024, lockWaiters * 128/256/512) than
> currently extraBlocks = Min(512, lockWaiters * 20);. This will also
> give some speedup and we don't see any regression with parallel
> inserts in CTAS patches.
> 
> But, I'm not so sure that the hackers will agree any of the above as a
> practical solution to the "relation extension" problem.

I think I understand your concern about resource consumption and the impact on 
other concurrently running jobs (OLTP, data analysis).

OTOH, what is the situation when the user wants to run CTAS and, further, wants 
to speed it up by using parallelism?  Isn't it okay to let the (parallel) CTAS 
use as many resources as it wants?  At least, I think we can provide another mode 
for it, like Oracle provides conventional path and direct path modes for 
INSERT and data loading.

What would we want to do to maximize the parallel CTAS speedup if we were a bit 
unshackled from the current constraints (alignment with existing code, impact 
on other concurrent workloads)?

* Use as many shared buffers as possible (rather than the small BULKWRITE ring) 
to reduce WAL flushes.
Otherwise, a plain INSERT SELECT may be faster?

* Minimize relation extension overhead (= increase the block count per extension).
posix_fallocate() would help too (see the first sketch after this list).

* Distribute the added pages among the parallel workers, and have each worker 
fill its pages to full capacity (see the second sketch after this list).
The worker that extended the relation stores the block numbers of the added pages 
in shared memory for the parallel execution.  Each worker then takes a page from 
there, instead of going through the FSM, after waiting for the relation extension 
lock.
The last page each worker was using will be left partially filled, but the amount 
of unused space should be small compared to the total table size.
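
As for the second point, here is a minimal, self-contained illustration of
reserving many blocks with a single posix_fallocate() call instead of writing
zero-filled pages one by one.  It works at the plain file level and
deliberately ignores the smgr/md layering, WAL, and proper error handling; the
file name and helper are made up for the example:

#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLCKSZ 8192

/*
 * Reserve nblocks additional blocks at the end of the file in one call.
 * posix_fallocate() guarantees the space is allocated, so later writes into
 * the new blocks cannot fail with ENOSPC.  Returns 0 or an errno value.
 */
static int
extend_by_blocks(int fd, off_t current_blocks, off_t nblocks)
{
    return posix_fallocate(fd, current_blocks * BLCKSZ, nblocks * BLCKSZ);
}

int
main(void)
{
    int     fd = open("demo_relation", O_RDWR | O_CREAT, 0600);
    int     rc;

    if (fd < 0)
    {
        perror("open");
        return EXIT_FAILURE;
    }
    rc = extend_by_blocks(fd, 0, 512);
    if (rc != 0)
    {
        fprintf(stderr, "posix_fallocate: %s\n", strerror(rc));
        return EXIT_FAILURE;
    }
    printf("extended by 512 blocks (%d bytes)\n", 512 * BLCKSZ);
    return EXIT_SUCCESS;
}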
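
And for the last point, a rough sketch of handing out the pre-extended blocks
to workers directly instead of via the FSM.  In the real thing this structure
would live in the DSM segment shared by the CTAS workers and BlockNumber would
come from block.h; here C11 atomics and a typedef merely stand in for shared
memory, and all names are made up for the illustration:

#include <inttypes.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t BlockNumber;       /* stand-in for PostgreSQL's BlockNumber */
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)
#define MAX_PREALLOCATED 1024

/* Would be placed in the shared memory used for the parallel execution. */
typedef struct PreallocatedBlocks
{
    atomic_uint next;               /* next array slot to hand out */
    uint32_t    nblocks;            /* how many blocks were published */
    BlockNumber blocks[MAX_PREALLOCATED];
} PreallocatedBlocks;

/*
 * Called by the worker that just extended the relation: publish the new block
 * numbers so that the other workers can pick them up without touching the FSM.
 */
static void
publish_blocks(PreallocatedBlocks *p, BlockNumber first, uint32_t count)
{
    if (count > MAX_PREALLOCATED)
        count = MAX_PREALLOCATED;
    for (uint32_t i = 0; i < count; i++)
        p->blocks[i] = first + i;
    p->nblocks = count;
    atomic_store(&p->next, 0);
}

/*
 * Called by any worker that needs an empty page: claim the next unused block,
 * or return InvalidBlockNumber when the batch is exhausted and the relation
 * has to be extended again.
 */
static BlockNumber
claim_block(PreallocatedBlocks *p)
{
    uint32_t    slot = atomic_fetch_add(&p->next, 1);

    return (slot < p->nblocks) ? p->blocks[slot] : InvalidBlockNumber;
}

int
main(void)
{
    PreallocatedBlocks p = {0};
    BlockNumber b;

    publish_blocks(&p, 100, 8);     /* pretend blocks 100..107 were just added */
    while ((b = claim_block(&p)) != InvalidBlockNumber)
        printf("worker would fill block %" PRIu32 "\n", b);
    return 0;
}

An atomic fetch-add is enough for handing out the blocks because they are only
claimed, never returned.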



Regards
Takayuki Tsunakawa
