On 2015-11-15 16:24:24 -0500, Robert Haas wrote:
> I think it needs to be adapted to use predefined constants for the
> tranche IDs instead of hardcoded values, maybe based on the attached
> tranche-constants.patch.

Yea, that part is clearly not ok. Let me look at the patch.

> Also, I think we should rip all the volatile qualifiers out of
> bufmgr.c, using the second attached patch, devolatileize-bufmgr.patch.
> The comment claiming that we need this because spinlocks aren't
> compiler barriers is obsolete.

Same here.

> @@ -457,7 +457,7 @@ CreateLWLocks(void)
>               LWLockCounter = (int *) ((char *) MainLWLockArray - 3 * sizeof(int));
>               LWLockCounter[0] = NUM_FIXED_LWLOCKS;
>               LWLockCounter[1] = numLocks;
> -             LWLockCounter[2] = 1;   /* 0 is the main array */
> +             LWLockCounter[2] = LWTRANCHE_LAST_BUILTIN_ID + 1;
>       }

Man this LWLockCounter[0] stuff is just awful. Absolutely nothing that
this patch needs to fix, but it really should be fixed someday.

>  /*
> + * We reserve a few predefined tranche IDs.  These values will never be
> + * returned by LWLockNewTrancheId.
> + */
> +#define LWTRANCHE_MAIN                                               0
> +#define LWTRANCHE_BUFFER_CONTENT                     1
> +#define LWTRANCHE_BUFFER_IO_IN_PROGRESS              2
> +#define LWTRANCHE_LAST_BUILTIN_ID                    LWTRANCHE_BUFFER_IO_IN_PROGRESS

Nitpick: I'm inclined to use an enum to avoid having to adjust the last
builtin id when adding a new builtin tranche.
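
Something like this, say (just a sketch; the trailing sentinel member
is my invention, not part of the patch):

    typedef enum BuiltinTrancheIds
    {
        LWTRANCHE_MAIN = 0,
        LWTRANCHE_BUFFER_CONTENT,
        LWTRANCHE_BUFFER_IO_IN_PROGRESS,
        /* add new builtin tranches immediately above this line */
        LWTRANCHE_FIRST_USER_DEFINED
    } BuiltinTrancheIds;

Then CreateLWLocks() can seed the counter with
LWTRANCHE_FIRST_USER_DEFINED, and nobody has to remember to bump a
LAST_BUILTIN_ID define when adding a tranche.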


Besides that minor thing I think this works for me. We might
separately want something that makes adding non-builtin tranches
easier, but that's really an independent project.
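
For context on why that's annoying today, the registration dance for
a non-builtin tranche looks roughly like this (from memory, so check
lwlock.h; "mymodule" and all the variable names here are made up):

    #include "storage/lwlock.h"

    static LWLock *my_lwlocks;      /* allocated in shared memory elsewhere */
    static LWLockTranche my_tranche;
    static int  my_tranche_id;

    static void
    register_my_tranche(void)
    {
        my_tranche_id = LWLockNewTrancheId();
        my_tranche.name = "mymodule";
        my_tranche.array_base = my_lwlocks;
        my_tranche.array_stride = sizeof(LWLock);
        LWLockRegisterTranche(my_tranche_id, &my_tranche);
        LWLockInitialize(&my_lwlocks[0], my_tranche_id);
    }

And since the registration lives in backend-local memory, every
backend that wants the tranche name has to repeat it.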

> From 9e7f9219b5e752da46be0e26a0be074191ae8f62 Mon Sep 17 00:00:00 2001
> From: Robert Haas <rh...@postgresql.org>
> Date: Sun, 15 Nov 2015 10:24:03 -0500
> Subject: [PATCH 1/3] De-volatile-ize.

I very strongly think this should be done. It's painful having to
cast away volatile, and it de-optimizes a fair bit of
performance-relevant code.
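
To make the de-optimization concrete (a generic sketch, nothing to do
with bufmgr.c specifically): with a volatile-qualified pointer every
access is a mandatory memory load, so the compiler can neither cache
a field in a register nor combine adjacent reads:

    typedef struct Hdr { int flags; } Hdr;

    int
    twice_volatile(volatile Hdr *h)
    {
        return h->flags + h->flags; /* must emit two loads */
    }

    int
    twice_plain(Hdr *h)
    {
        return h->flags + h->flags; /* one load, result doubled */
    }

And calling any function that takes a plain Hdr * requires explicitly
casting the qualifier away.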

>  /* local state for StartBufferIO and related functions */
> -static volatile BufferDesc *InProgressBuf = NULL;
> +static BufferDesc *InProgressBuf = NULL;
>  static bool IsForInput;
>  
>  /* local state for LockBufferForCleanup */
> -static volatile BufferDesc *PinCountWaitBuf = NULL;
> +static BufferDesc *PinCountWaitBuf = NULL;

Hm. These could also be relevant for sigsetjmp et al. Haven't found
relevant code tho.
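
(For anyone following along, the sigsetjmp concern is C99 7.13.2.1: a
non-volatile local that's modified between setjmp and longjmp has an
indeterminate value afterwards. A minimal illustration, unrelated to
bufmgr.c itself:

    #include <setjmp.h>

    static sigjmp_buf env;

    int
    example(void)
    {
        int     val = 0;    /* not volatile */

        if (sigsetjmp(env, 1) == 0)
        {
            val = 42;
            /* ... code here might siglongjmp(env, 1) ... */
        }

        /*
         * If we got here via siglongjmp, val is indeterminate,
         * because it was modified after sigsetjmp and isn't volatile.
         */
        return val;
    }

Note the rule only covers automatics local to the function containing
the sigsetjmp; InProgressBuf and PinCountWaitBuf are statics, so they
should be fine on that score.)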

> -     volatile BufferDesc *bufHdr;
> +     BufferDesc *bufHdr;
>       Block           bufBlock;

This looks mis-indented now, and similarly in a bunch of other
places. Maybe run pgindent afterwards?


>               /*
>                * Header spinlock is enough to examine BM_DIRTY, see comment in
> @@ -1707,7 +1707,7 @@ BufferSync(int flags)
>       num_written = 0;
>       while (num_to_scan-- > 0)
>       {
> -             volatile BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
> +             BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
>

This case has some theoretical behavioural implications: as
bufHdr->flags is accessed without the spinlock held, the compiler
might reuse an older value. But that's ok: a) there's no earlier load
it could actually reuse, and b) the race condition exists anyway, and
the comment in the loop body explains why that's acceptable.
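
I.e. the pattern in BufferSync() is the usual unlocked precheck
followed by an authoritative test under the header spinlock.
Condensed sketch with stand-in declarations, not the literal code:

    typedef struct BufferDesc { int flags; /* ... */ } BufferDesc;
    #define BM_CHECKPOINT_NEEDED (1 << 7)   /* value illustrative */

    extern void LockBufHdr(BufferDesc *buf);    /* spinlock, hence a
                                                 * compiler barrier */
    extern void UnlockBufHdr(BufferDesc *buf);
    extern void SyncOneBuffer(BufferDesc *buf); /* stand-in for the
                                                 * real work */

    static void
    maybe_sync(BufferDesc *bufHdr)
    {
        /* unlocked precheck: a stale read here is harmless */
        if (bufHdr->flags & BM_CHECKPOINT_NEEDED)
        {
            LockBufHdr(bufHdr);
            if (bufHdr->flags & BM_CHECKPOINT_NEEDED)   /* re-check */
                SyncOneBuffer(bufHdr);
            UnlockBufHdr(bufHdr);
        }
    }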

>  BlockNumber
>  BufferGetBlockNumber(Buffer buffer)
>  {
> -     volatile BufferDesc *bufHdr;
> +     BufferDesc *bufHdr;
>  
>       Assert(BufferIsPinned(buffer));
>  
> @@ -2346,7 +2346,7 @@ void
>  BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
>                        BlockNumber *blknum)
>  {
> -     volatile BufferDesc *bufHdr;
> +     BufferDesc *bufHdr;

>       /* Do the same checks as BufferGetBlockNumber. */
>       Assert(BufferIsPinned(buffer));
> @@ -2382,7 +2382,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
>   * as the second parameter.  If not, pass NULL.
>   */
>  static void
> -FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln)
> +FlushBuffer(BufferDesc *buf, SMgrRelation reln)
>  {

>       XLogRecPtr      recptr;
>       ErrorContextCallback errcallback;
> @@ -2520,7 +2520,7 @@ RelationGetNumberOfBlocksInFork(Relation relation, ForkNumber forkNum)
>  bool
>  BufferIsPermanent(Buffer buffer)
>  {
> -     volatile BufferDesc *bufHdr;
> +     BufferDesc *bufHdr;


These require that the buffer is pinned (a barrier-implying
operation). The pin prevents the contents from changing in a relevant
manner, and the barrier implied by the pin forces the core's view to
be non-stale.
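
Spelled out as a sketch (stand-in types; PinBuffer internals elided):

    typedef struct BufferTag { unsigned blockNum; /* ... */ } BufferTag;
    typedef struct BufferDesc { BufferTag tag; /* ... */ } BufferDesc;

    /* pinning takes the header spinlock, which implies a barrier */
    extern void PinBuffer(BufferDesc *buf);

    static unsigned
    pinned_block_number(BufferDesc *bufHdr)
    {
        /*
         * No volatile needed: the tag can't change while we hold the
         * pin, and the barrier implied by pinning means we can't see
         * a tag value cached from before the pin was taken.
         */
        return bufHdr->tag.blockNum;
    }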


> @@ -2613,7 +2613,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
>  
>       for (i = 0; i < NBuffers; i++)
>       {
> -             volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> +             BufferDesc *bufHdr = GetBufferDescriptor(i);

>               /*
>                * We can make this a tad faster by prechecking the buffer tag before
> @@ -2703,7 +2703,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
>       for (i = 0; i < NBuffers; i++)
>       {
>               RelFileNode *rnode = NULL;
> -             volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> +             BufferDesc *bufHdr = GetBufferDescriptor(i);
>  
>               /*
>                * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
> @@ -2767,7 +2767,7 @@ DropDatabaseBuffers(Oid dbid)
>  
>       for (i = 0; i < NBuffers; i++)
>       {
> -             volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> +             BufferDesc *bufHdr = GetBufferDescriptor(i);

> @@ -2863,7 +2863,7 @@ void
>  FlushRelationBuffers(Relation rel)
>  {
>       int                     i;
> -     volatile BufferDesc *bufHdr;
> +     BufferDesc *bufHdr;
>  
>       /* Open rel at the smgr level if not already done */
>       RelationOpenSmgr(rel);
> @@ -2955,7 +2955,7 @@ void
>  FlushDatabaseBuffers(Oid dbid)
>  {
>       int                     i;
> -     volatile BufferDesc *bufHdr;
> +     BufferDesc *bufHdr;

These are all correct based on the premise that some heavy lock -
implying a barrier - prevents new blocks with relevant tags from
being loaded into s_b. And it's fine if some other backend
concurrently writes out the buffer before we do - we'll notice that
in the re-check after taking the buffer header spinlock.

So, looks good to me.

Andres

