Hi,

On 2022-03-02 06:46:23 +1300, Thomas Munro wrote:
> > I do think it's worth giving that sleep a proper wait event though, even in 
> > the back branches.
> 
> I'm thinking that 0002 should be back-patched all the way, but 0001
> could be limited to 14.

No strong opinion on back to where to backpatch. It'd be nice to have a proper
wait event everywhere, but especially < 12 it looks different enough to be
some effort.


> From a9344bb2fb2a363bec4be526f87560cb212ca10b Mon Sep 17 00:00:00 2001
> From: Thomas Munro <thomas.mu...@gmail.com>
> Date: Mon, 28 Feb 2022 11:27:05 +1300
> Subject: [PATCH v2 1/3] Wake up for latches in CheckpointWriteDelay().

LGTM. Would be nice to have this fixed soon, even if it's just to reduce test
times :)



> From 1eb0266fed7ccb63a2430e4fbbaef2300f2aa0d0 Mon Sep 17 00:00:00 2001
> From: Thomas Munro <thomas.mu...@gmail.com>
> Date: Tue, 1 Mar 2022 11:38:27 +1300
> Subject: [PATCH v2 2/3] Fix waiting in RegisterSyncRequest().

LGTM.


> From 50060e5a0ed66762680ddee9e30acbad905c6e98 Mon Sep 17 00:00:00 2001
> From: Thomas Munro <thomas.mu...@gmail.com>
> Date: Tue, 1 Mar 2022 17:34:43 +1300
> Subject: [PATCH v2 3/3] Use condition variable to wait when sync request queue
>  is full.
> 
> Previously, in the (hopefully) rare case that we need to wait for the
> checkpointer to create space in the sync request queue, we'd enter a
> sleep/retry loop.  Instead, create a condition variable so the
> checkpointer can wake us up whenever there is a transition from 'full'
> to 'not full'.


> @@ -1076,10 +1078,11 @@ RequestCheckpoint(int flags)
>   * to perform its own fsync, which is far more expensive in practice.  It
>   * is theoretically possible a backend fsync might still be necessary, if
>   * the queue is full and contains no duplicate entries.  In that case, we
> - * let the backend know by returning false.
> + * let the backend know by returning false, or if 'wait' is true, then we
> + * wait for space to become available.
>   */
>  bool
> -ForwardSyncRequest(const FileTag *ftag, SyncRequestType type)
> +ForwardSyncRequest(const FileTag *ftag, SyncRequestType type, bool wait)
>  {
>       CheckpointerRequest *request;
>       bool            too_full;
> @@ -1101,9 +1104,9 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType 
> type)
>        * backend will have to perform its own fsync request.  But before 
> forcing
>        * that to happen, we can try to compact the request queue.
>        */
> -     if (CheckpointerShmem->checkpointer_pid == 0 ||
> -             (CheckpointerShmem->num_requests >= 
> CheckpointerShmem->max_requests &&
> -              !CompactCheckpointerRequestQueue()))
> +     if (CheckpointerShmem->num_requests >= CheckpointerShmem->max_requests 
> &&
> +             !CompactCheckpointerRequestQueue() &&
> +             !wait)

Bit confused about the addition of the wait parameter / removal of the
CheckpointerShmem->checkpointer_pid check. What's that about?


> +     /*
> +      * If we still don't have enough space and the caller asked us to wait,
> +      * wait for the checkpointer to advertise that there is space.
> +      */
> +     if (CheckpointerShmem->num_requests >= CheckpointerShmem->max_requests)
> +     {
> +             
> ConditionVariablePrepareToSleep(&CheckpointerShmem->requests_not_full_cv);
> +             while (CheckpointerShmem->num_requests >=
> +                        CheckpointerShmem->max_requests)
> +             {
> +                     LWLockRelease(CheckpointerCommLock);
> +                     
> ConditionVariableSleep(&CheckpointerShmem->requests_not_full_cv,
> +                                                                
> WAIT_EVENT_FORWARD_SYNC_REQUEST);
> +                     LWLockAcquire(CheckpointerCommLock, LW_EXCLUSIVE);
> +             }
> +             ConditionVariableCancelSleep();
> +     }

Could there be a problem with a lot of backends trying to acquire
CheckpointerCommLock in exclusive mode? I'm inclined to think it's rare enough
to not worry.

Greetings,

Andres Freund


Reply via email to