On Mon, Jul 11, 2022 at 9:49 PM Robert Haas <robertmh...@gmail.com> wrote:
>

> It also makes me wonder why we're using macros rather than static
> inline functions in buf_internals.h. I wonder whether we could do
> something like this, for example, and keep InvalidForkNumber as -1:
>
> static inline ForkNumber
> BufTagGetForkNum(BufferTag *tagPtr)
> {
>     int8 ret;
>
>     StaticAssertStmt(MAX_FORKNUM <= INT8_MAX);
>     ret = (int8) (tagPtr->relForkDetails[0] >> BUFFERTAG_RELNUMBER_BITS);
>     return (ForkNumber) ret;
> }
>
> Even if we don't use that particular trick, I think we've generally
> been moving toward using static inline functions rather than macros,
> because it provides better type-safety and the code is often easier to
> read. Maybe we should also approach it that way here. Or even commit a
> preparatory patch replacing the existing macros with inline functions.
> Or maybe it's best to leave it alone, not sure.

I think it makes sense to convert the existing macros as well; I have
attached a patch for that.
>
> > I had those changes in v7-0003; now I have merged them into 0002.  This
> > has an assert check while replaying the WAL for smgr create and smgr
> > truncate, and in the normal path, when allocating a new relfilenumber,
> > we assert that no file already exists.
>
> I think a test-and-elog might be better. Most users won't be running
> assert-enabled builds, but this seems worth checking regardless.

IMHO the recovery-time asserts can be converted to elog, but isn't the
check we perform after each GetNewRelFileNumber() better kept as an
assert, since it involves a file access and could therefore be costly?
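
To make the tradeoff concrete, here is a minimal sketch of the two
styles (RelFileExists() is a made-up name standing in for whatever
check does the file access; the real code may look different):

	relnumber = GetNewRelFileNumber();

	/* Assert only: compiled out of non-assert builds, so ordinary
	 * users never pay for the file access. */
	Assert(!RelFileExists(relnumber));

	/* test-and-elog: always executed, so it also catches problems on
	 * production builds, at the cost of one file access per call. */
	if (RelFileExists(relnumber))
		elog(ERROR, "relfilenumber %llu is already in use",
			 (unsigned long long) relnumber);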

> > I have done some performance tests; with very small values I can see a
> > lot of wait events for RelFileNumberGen, but with bigger numbers like
> > 256 or 512 it is not really bad.  See results at the end of the
> > mail[1]
>
> It's a little hard to interpret these results because you don't say
> how often you were checking the wait events, or how long the
> operation took to complete. I suppose we can guess the relative time
> scale from the number of Activity events: if there were 190
> WalWriterMain events observed, then the time to complete the operation
> is probably 190 times how often you were checking the wait events, but
> was that every second or every half second or every tenth of a second?

I am executing it every 0.5 sec, using the below script in psql:
\t
select wait_event_type, wait_event from pg_stat_activity where pid !=
pg_backend_pid()
\watch 0.5

And running the test for 60 seconds:
./pgbench -c 32 -j 32 -T 60 -f create_script.sql -p 54321  postgres

$ cat create_script.sql
select create_table(100);

-- function body of create_table()
CREATE OR REPLACE FUNCTION create_table(count int) RETURNS void AS $$
DECLARE
  relname varchar;
  pid int;
  i   int;
BEGIN
  SELECT pg_backend_pid() INTO pid;
  relname := 'test_' || pid;
  FOR i IN 1..count LOOP
    EXECUTE format('CREATE TABLE %s(a int)', relname);

    EXECUTE format('DROP TABLE %s', relname);
  END LOOP;
END;
$$ LANGUAGE plpgsql;



> > I have done these changes during GetNewRelFileNumber(); this required
> > tracking the last logged record pointer as well, but I think this looks
> > clean.  With this I can see some reduction in the RelFileNumberGen wait
> > event[1]
>
> I find the code you wrote here a little bit magical. I believe it
> depends heavily on choosing to issue the new WAL record when we've
> exhausted exactly 50% of the available space. I suggest having two
> constants, one of which is the number of relfilenumber values per WAL
> record, and the other of which is the threshold for issuing a new WAL
> record. Maybe something like RFN_VALUES_PER_XLOG and
> RFN_NEW_XLOG_THRESHOLD, or something. And then write code that works
> correctly for any value of RFN_NEW_XLOG_THRESHOLD between 0 (don't log
> new RFNs until old allocation is completely exhausted) and
> RFN_VALUES_PER_XLOG - 1 (log new RFNs after using just 1 item from the
> previous allocation). That way, if in the future someone decides to
> change the constant values, they can do that and the code still works.

ok
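
Just to confirm I have understood the scheme, a rough sketch (all
constant and variable names below are made up for illustration) would
be something like:

	#define RFN_VALUES_PER_XLOG		512	/* RFNs covered by one WAL record */
	#define RFN_NEW_XLOG_THRESHOLD	256	/* log again when this many remain */

	/* inside GetNewRelFileNumber(), under the relevant lock */
	if (loggedRelFileNumber - nextRelFileNumber <= RFN_NEW_XLOG_THRESHOLD)
	{
		/*
		 * Correct for any threshold from 0 (log only when the previous
		 * allocation is fully consumed) up to RFN_VALUES_PER_XLOG - 1
		 * (log after handing out a single value).
		 */
		loggedRelFileNumber = nextRelFileNumber + RFN_VALUES_PER_XLOG;
		LogNextRelFileNumber(loggedRelFileNumber);
	}

That way, changing either constant later would not require touching the
logic.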



-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 553a6b185ee7270bf206387421e50b48c7a8b97b Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilip.ku...@enterprisedb.com>
Date: Tue, 12 Jul 2022 17:10:04 +0530
Subject: [PATCH v1] Convert buf_internals.h macros to static inline functions

Inline functions are more readable than macros.  This will also help us
write cleaner and more readable code for the 64-bit relfilenode patch,
which will have to do some complex bitwise operations; doing those
inside inline functions will be cleaner.
---
 src/backend/storage/buffer/buf_init.c |   2 +-
 src/backend/storage/buffer/bufmgr.c   |  16 ++--
 src/backend/storage/buffer/localbuf.c |  12 +--
 src/include/storage/buf_internals.h   | 156 +++++++++++++++++++++-------------
 4 files changed, 111 insertions(+), 75 deletions(-)

diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 2862e9e..55f646d 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -116,7 +116,7 @@ InitBufferPool(void)
 		{
 			BufferDesc *buf = GetBufferDescriptor(i);
 
-			CLEAR_BUFFERTAG(buf->tag);
+			CLEAR_BUFFERTAG(&buf->tag);
 
 			pg_atomic_init_u32(&buf->state, 0);
 			buf->wait_backend_pgprocno = INVALID_PGPROCNO;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e257ae2..0ec5891 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -515,7 +515,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
 	Assert(BlockNumberIsValid(blockNum));
 
 	/* create a tag so we can lookup the buffer */
-	INIT_BUFFERTAG(newTag, smgr_reln->smgr_rlocator.locator,
+	INIT_BUFFERTAG(&newTag, smgr_reln->smgr_rlocator.locator,
 				   forkNum, blockNum);
 
 	/* determine its hash code and partition lock ID */
@@ -632,7 +632,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
 
 	ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
 	ReservePrivateRefCountEntry();
-	INIT_BUFFERTAG(tag, rlocator, forkNum, blockNum);
+	INIT_BUFFERTAG(&tag, rlocator, forkNum, blockNum);
 
 	if (BufferIsLocal(recent_buffer))
 	{
@@ -640,7 +640,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
 		buf_state = pg_atomic_read_u32(&bufHdr->state);
 
 		/* Is it still valid and holding the right tag? */
-		if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+		if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
 		{
 			/* Bump local buffer's ref and usage counts. */
 			ResourceOwnerRememberBuffer(CurrentResourceOwner, recent_buffer);
@@ -669,7 +669,7 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
 		else
 			buf_state = LockBufHdr(bufHdr);
 
-		if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(tag, bufHdr->tag))
+		if ((buf_state & BM_VALID) && BUFFERTAGS_EQUAL(&tag, &bufHdr->tag))
 		{
 			/*
 			 * It's now safe to pin the buffer.  We can't pin first and ask
@@ -1124,7 +1124,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	uint32		buf_state;
 
 	/* create a tag so we can lookup the buffer */
-	INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+	INIT_BUFFERTAG(&newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
 
 	/* determine its hash code and partition lock ID */
 	newHash = BufTableHashCode(&newTag);
@@ -1507,7 +1507,7 @@ retry:
 	buf_state = LockBufHdr(buf);
 
 	/* If it's changed while we were waiting for lock, do nothing */
-	if (!BUFFERTAGS_EQUAL(buf->tag, oldTag))
+	if (!BUFFERTAGS_EQUAL(&buf->tag, &oldTag))
 	{
 		UnlockBufHdr(buf, buf_state);
 		LWLockRelease(oldPartitionLock);
@@ -1539,7 +1539,7 @@ retry:
 	 * linear scans of the buffer array don't think the buffer is valid.
 	 */
 	oldFlags = buf_state & BUF_FLAG_MASK;
-	CLEAR_BUFFERTAG(buf->tag);
+	CLEAR_BUFFERTAG(&buf->tag);
 	buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
 	UnlockBufHdr(buf, buf_state);
 
@@ -3356,7 +3356,7 @@ FindAndDropRelFileLocatorBuffers(RelFileLocator rlocator, ForkNumber forkNum,
 		uint32		buf_state;
 
 		/* create a tag so we can lookup the buffer */
-		INIT_BUFFERTAG(bufTag, rlocator, forkNum, curBlock);
+		INIT_BUFFERTAG(&bufTag, rlocator, forkNum, curBlock);
 
 		/* determine its hash code and partition lock ID */
 		bufHash = BufTableHashCode(&bufTag);
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 41a0807..6618ef1 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -68,7 +68,7 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
 	BufferTag	newTag;			/* identity of requested block */
 	LocalBufferLookupEnt *hresult;
 
-	INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+	INIT_BUFFERTAG(&newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
 
 	/* Initialize local buffers if first request in this session */
 	if (LocalBufHash == NULL)
@@ -117,7 +117,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
 	bool		found;
 	uint32		buf_state;
 
-	INIT_BUFFERTAG(newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
+	INIT_BUFFERTAG(&newTag, smgr->smgr_rlocator.locator, forkNum, blockNum);
 
 	/* Initialize local buffers if first request in this session */
 	if (LocalBufHash == NULL)
@@ -131,7 +131,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
 	{
 		b = hresult->id;
 		bufHdr = GetLocalBufferDescriptor(b);
-		Assert(BUFFERTAGS_EQUAL(bufHdr->tag, newTag));
+		Assert(BUFFERTAGS_EQUAL(&bufHdr->tag, &newTag));
 #ifdef LBDEBUG
 		fprintf(stderr, "LB ALLOC (%u,%d,%d) %d\n",
 				smgr->smgr_rlocator.locator.relNumber, forkNum, blockNum, -b - 1);
@@ -253,7 +253,7 @@ LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
 		if (!hresult)			/* shouldn't happen */
 			elog(ERROR, "local buffer hash table corrupted");
 		/* mark buffer invalid just in case hash insert fails */
-		CLEAR_BUFFERTAG(bufHdr->tag);
+		CLEAR_BUFFERTAG(&bufHdr->tag);
 		buf_state &= ~(BM_VALID | BM_TAG_VALID);
 		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
 	}
@@ -354,7 +354,7 @@ DropRelFileLocatorLocalBuffers(RelFileLocator rlocator, ForkNumber forkNum,
 			if (!hresult)		/* shouldn't happen */
 				elog(ERROR, "local buffer hash table corrupted");
 			/* Mark buffer invalid */
-			CLEAR_BUFFERTAG(bufHdr->tag);
+			CLEAR_BUFFERTAG(&bufHdr->tag);
 			buf_state &= ~BUF_FLAG_MASK;
 			buf_state &= ~BUF_USAGECOUNT_MASK;
 			pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
@@ -398,7 +398,7 @@ DropRelFileLocatorAllLocalBuffers(RelFileLocator rlocator)
 			if (!hresult)		/* shouldn't happen */
 				elog(ERROR, "local buffer hash table corrupted");
 			/* Mark buffer invalid */
-			CLEAR_BUFFERTAG(bufHdr->tag);
+			CLEAR_BUFFERTAG(&bufHdr->tag);
 			buf_state &= ~BUF_FLAG_MASK;
 			buf_state &= ~BUF_USAGECOUNT_MASK;
 			pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index aded5e8..d8ddb54 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -95,28 +95,32 @@ typedef struct buftag
 	BlockNumber blockNum;		/* blknum relative to begin of reln */
 } BufferTag;
 
-#define CLEAR_BUFFERTAG(a) \
-( \
-	(a).rlocator.spcOid = InvalidOid, \
-	(a).rlocator.dbOid = InvalidOid, \
-	(a).rlocator.relNumber = InvalidRelFileNumber, \
-	(a).forkNum = InvalidForkNumber, \
-	(a).blockNum = InvalidBlockNumber \
-)
-
-#define INIT_BUFFERTAG(a,xx_rlocator,xx_forkNum,xx_blockNum) \
-( \
-	(a).rlocator = (xx_rlocator), \
-	(a).forkNum = (xx_forkNum), \
-	(a).blockNum = (xx_blockNum) \
-)
-
-#define BUFFERTAGS_EQUAL(a,b) \
-( \
-	RelFileLocatorEquals((a).rlocator, (b).rlocator) && \
-	(a).blockNum == (b).blockNum && \
-	(a).forkNum == (b).forkNum \
-)
+static inline void
+CLEAR_BUFFERTAG(BufferTag *tag)
+{
+	tag->rlocator.spcOid = InvalidOid;
+	tag->rlocator.dbOid = InvalidOid;
+	tag->rlocator.relNumber = InvalidRelFileNumber;
+	tag->forkNum = InvalidForkNumber;
+	tag->blockNum = InvalidBlockNumber;
+}
+
+static inline void
+INIT_BUFFERTAG(BufferTag *tag, RelFileLocator rlocator,
+			   ForkNumber forkNum, BlockNumber blockNum)
+{
+	tag->rlocator = rlocator;
+	tag->forkNum = forkNum;
+	tag->blockNum = blockNum;
+}
+
+static inline bool
+BUFFERTAGS_EQUAL(BufferTag *tag1, BufferTag *tag2)
+{
+	return RelFileLocatorEquals(tag1->rlocator, tag2->rlocator) &&
+		(tag1->blockNum == tag2->blockNum) &&
+		(tag1->forkNum == tag2->forkNum);
+}
 
 /*
  * The shared buffer mapping table is partitioned to reduce contention.
@@ -124,13 +128,24 @@ typedef struct buftag
  * hash code with BufTableHashCode(), then apply BufMappingPartitionLock().
  * NB: NUM_BUFFER_PARTITIONS must be a power of 2!
  */
-#define BufTableHashPartition(hashcode) \
-	((hashcode) % NUM_BUFFER_PARTITIONS)
-#define BufMappingPartitionLock(hashcode) \
-	(&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + \
-		BufTableHashPartition(hashcode)].lock)
-#define BufMappingPartitionLockByIndex(i) \
-	(&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + (i)].lock)
+static inline uint32
+BufTableHashPartition(uint32 hashcode)
+{
+	return hashcode % NUM_BUFFER_PARTITIONS;
+}
+
+static inline LWLock *
+BufMappingPartitionLock(uint32 hashcode)
+{
+	return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET +
+							BufTableHashPartition(hashcode)].lock;
+}
+
+static inline LWLock *
+BufMappingPartitionLockByIndex(uint32 index)
+{
+	return &MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + index].lock;
+}
 
 /*
  *	BufferDesc -- shared descriptor/state data for a single shared buffer.
@@ -220,37 +235,6 @@ typedef union BufferDescPadded
 	char		pad[BUFFERDESC_PAD_TO_SIZE];
 } BufferDescPadded;
 
-#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)
-#define GetLocalBufferDescriptor(id) (&LocalBufferDescriptors[(id)])
-
-#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
-
-#define BufferDescriptorGetIOCV(bdesc) \
-	(&(BufferIOCVArray[(bdesc)->buf_id]).cv)
-#define BufferDescriptorGetContentLock(bdesc) \
-	((LWLock*) (&(bdesc)->content_lock))
-
-extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
-
-/*
- * The freeNext field is either the index of the next freelist entry,
- * or one of these special values:
- */
-#define FREENEXT_END_OF_LIST	(-1)
-#define FREENEXT_NOT_IN_LIST	(-2)
-
-/*
- * Functions for acquiring/releasing a shared buffer header's spinlock.  Do
- * not apply these to local buffers!
- */
-extern uint32 LockBufHdr(BufferDesc *desc);
-#define UnlockBufHdr(desc, s)	\
-	do {	\
-		pg_write_barrier(); \
-		pg_atomic_write_u32(&(desc)->state, (s) & (~BM_LOCKED)); \
-	} while (0)
-
-
 /*
  * The PendingWriteback & WritebackContext structure are used to keep
  * information about pending flush requests to be issued to the OS.
@@ -276,11 +260,63 @@ typedef struct WritebackContext
 
 /* in buf_init.c */
 extern PGDLLIMPORT BufferDescPadded *BufferDescriptors;
+extern PGDLLIMPORT ConditionVariableMinimallyPadded *BufferIOCVArray;
 extern PGDLLIMPORT WritebackContext BackendWritebackContext;
 
 /* in localbuf.c */
 extern PGDLLIMPORT BufferDesc *LocalBufferDescriptors;
 
+
+static inline BufferDesc *
+GetBufferDescriptor(uint32 id)
+{
+	return &(BufferDescriptors[id]).bufferdesc;
+}
+
+static inline BufferDesc *
+GetLocalBufferDescriptor(uint32 id)
+{
+	return &LocalBufferDescriptors[id];
+}
+
+static inline Buffer
+BufferDescriptorGetBuffer(BufferDesc *bdesc)
+{
+	return (Buffer) (bdesc->buf_id + 1);
+}
+
+static inline ConditionVariable *
+BufferDescriptorGetIOCV(BufferDesc *bdesc)
+{
+	return &(BufferIOCVArray[bdesc->buf_id]).cv;
+}
+
+static inline LWLock *
+BufferDescriptorGetContentLock(BufferDesc *bdesc)
+{
+	return (LWLock *) (&bdesc->content_lock);
+}
+
+/*
+ * The freeNext field is either the index of the next freelist entry,
+ * or one of these special values:
+ */
+#define FREENEXT_END_OF_LIST	(-1)
+#define FREENEXT_NOT_IN_LIST	(-2)
+
+/*
+ * Functions for acquiring/releasing a shared buffer header's spinlock.  Do
+ * not apply these to local buffers!
+ */
+extern uint32 LockBufHdr(BufferDesc *desc);
+
+static inline void
+UnlockBufHdr(BufferDesc *desc, uint32 buf_state)
+{
+	pg_write_barrier();
+	pg_atomic_write_u32(&desc->state, buf_state & (~BM_LOCKED));
+}
+
 /* in bufmgr.c */
 
 /*
-- 
1.8.3.1
