Hi Bharath-san,

From: Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com>
Sent: Friday, May 21, 2021 6:49 PM
> 
> On Fri, May 21, 2021 at 3:46 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Fri, Mar 19, 2021 at 11:02 AM Bharath Rupireddy
> > <bharath.rupireddyforpostg...@gmail.com> wrote:
> > >
> > > On Wed, Jan 27, 2021 at 1:47 PM Bharath Rupireddy
> > > <bharath.rupireddyforpostg...@gmail.com> wrote:
> > > >
> > >
> > > I analyzed the performance of parallel inserts in CTAS for
> > > different cases with tuple sizes of 32, 59, 241, and 1064 bytes.
> > > We see gains when the tuple sizes are smaller, but when the tuple
> > > size is larger, i.e. 1064 bytes, there is a regression with
> > > parallel inserts. Upon further analysis, it turned out that the
> > > parallel workers require frequent extra-block additions while
> > > concurrently extending the relation (in RelationAddExtraBlocks),
> > > and the majority of the time is spent flushing those new empty
> > > pages/blocks to disk.
> > >
> >
> > How have you ensured that the cost is due to the flushing of pages?
> > AFAICS, we don't flush the pages; rather, we just write them and
> > then register them to be flushed by the checkpointer. Now, it is
> > possible that the checkpointer sync queue gets full and the backend
> > has to write by itself, but have we checked that? I think we can
> > check via wait events; if it is due to flushing, then we should see
> > a lot of file sync (WAIT_EVENT_DATA_FILE_SYNC) wait events. The
> > other possibility could be that the free pages added to the FSM by
> > one worker are not being used by another worker for some reason.
> > Can we debug and check whether the pages added by one worker are
> > being used by another worker?
> 
> Thanks! I will work on the above points sometime later.

I noticed one place which could be one of the reasons that cause the 
performance degradation.

+               /*
+                * We don't need to skip contacting the FSM while inserting
+                * tuples in parallel mode. While extending the relation,
+                * workers, instead of blocking on a page while another
+                * worker is inserting, can check the FSM for another page
+                * that can accommodate the tuples. This results in a major
+                * benefit for parallel inserts.
+                */
+               myState->ti_options = 0;

I am not quite sure that disabling TABLE_INSERT_SKIP_FSM in parallel
workers brings a performance gain. In my test environment, if I change
this code to keep the TABLE_INSERT_SKIP_FSM option, there seems to be
no performance degradation. Could you please give it a try?
(I tested with the SQL you provided earlier [1].)
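
For reference, the change I tried is just the following one-line switch
at that spot (a sketch against the posted patch, not a reviewed fix;
TABLE_INSERT_SKIP_FSM is the existing option from tableam.h that tells
table_tuple_insert() to extend the relation directly instead of
consulting the FSM for free space):

-               myState->ti_options = 0;
+               myState->ti_options = TABLE_INSERT_SKIP_FSM;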

[1] 
https://www.postgresql.org/message-id/CALj2ACWFvNm4d_uqT2iECPqaXZjEd-O%2By8xbghvqXeMLj0pxGw%40mail.gmail.com

Best regards,
houzj
