On Thu, Apr 17, 2014 at 11:23:24AM +0200, Andres Freund wrote:
> On 2014-04-16 19:18:02 -0400, Bruce Momjian wrote:
> > On Thu, Feb 6, 2014 at 09:40:32AM +0100, Andres Freund wrote:
> > > On 2014-02-05 12:36:42 -0500, Robert Haas wrote:
> > > > >> It may well be that your proposal is spot on. But I'd like to see some
> > > > >> data-structure-by-data-structure measurements, rather than assuming that
> > > > >> alignment must be a good thing.
> > > > >
> > > > > I am fine with just aligning BufferDescriptors properly. That has
> > > > > clearly shown massive improvements.
> > > >
> > > > I thought your previous idea of increasing BUFFERALIGN to 64 bytes had
> > > > a lot to recommend it.
> > >
> > > Good.
> > >
> > > I wonder if we shouldn't move that bit of logic:
> > >     if (size >= BUFSIZ)
> > >         newStart = BUFFERALIGN(newStart);
> > > out of ShmemAlloc() and instead have a ShmemAllocAligned() and
> > > ShmemInitStructAligned() that does it. So we can sensibly can control it
> > > per struct.
> > >
> > > > But that doesn't mean it doesn't need testing.
> > >
> > > I feel the need here, to say that I never said it doesn't need testing
> > > and never thought it didn't...
> >
> > Where are we on this?
>
> It needs somebody with time to evaluate possible performance regressions
> - I personally won't have time to look into this in detail before pgcon.

I am doing performance testing to try to complete this item.  I used the
first attached patch to report which structures are 64-byte aligned:

    64-byte shared memory alignment of Control File: 0
    64-byte shared memory alignment of XLOG Ctl: 1
    64-byte shared memory alignment of CLOG Ctl: 0
    64-byte shared memory alignment of CommitTs Ctl: 0
    64-byte shared memory alignment of CommitTs shared: 0
    64-byte shared memory alignment of SUBTRANS Ctl: 1
    64-byte shared memory alignment of MultiXactOffset Ctl: 1
    64-byte shared memory alignment of MultiXactMember Ctl: 1
    64-byte shared memory alignment of Shared MultiXact State: 1
    64-byte shared memory alignment of Buffer Descriptors: 1
    64-byte shared memory alignment of Buffer Blocks: 1
    64-byte shared memory alignment of Shared Buffer Lookup Table: 1
    64-byte shared memory alignment of Buffer Strategy Status: 1
    64-byte shared memory alignment of LOCK hash: 0
    64-byte shared memory alignment of PROCLOCK hash: 0
    64-byte shared memory alignment of Fast Path Strong Relation Lock Data: 0
    64-byte shared memory alignment of PREDICATELOCKTARGET hash: 0
    64-byte shared memory alignment of PREDICATELOCK hash: 0
    64-byte shared memory alignment of PredXactList: 0
    64-byte shared memory alignment of SERIALIZABLEXID hash: 1
    64-byte shared memory alignment of RWConflictPool: 1
    64-byte shared memory alignment of FinishedSerializableTransactions: 0
    64-byte shared memory alignment of OldSerXid SLRU Ctl: 1
    64-byte shared memory alignment of OldSerXidControlData: 1
    64-byte shared memory alignment of Proc Header: 0
    64-byte shared memory alignment of Proc Array: 0
    64-byte shared memory alignment of Backend Status Array: 0
    64-byte shared memory alignment of Backend Application Name Buffer: 0
    64-byte shared memory alignment of Backend Client Host Name Buffer: 0
    64-byte shared memory alignment of Backend Activity Buffer: 0
    64-byte shared memory alignment of Prepared Transaction Table: 0
    64-byte shared memory alignment of Background Worker Data: 0
    64-byte shared memory alignment of shmInvalBuffer: 1
    64-byte shared memory alignment of PMSignalState: 0
    64-byte shared memory alignment of ProcSignalSlots: 0
    64-byte shared memory alignment of Checkpointer Data: 0
    64-byte shared memory alignment of AutoVacuum Data: 0
    64-byte shared memory alignment of Wal Sender Ctl: 0
    64-byte shared memory alignment of Wal Receiver Ctl: 0
    64-byte shared memory alignment of BTree Vacuum State: 0
    64-byte shared memory alignment of Sync Scan Locations List: 0
    64-byte shared memory alignment of Async Queue Control: 0
    64-byte shared memory alignment of Async Ctl: 0

Many of these are 64-byte aligned, including Buffer Descriptors.  I
tested pgbench with these commands:

    $ pgbench -i -s 95 pgbench
    $ pgbench -S -c 95 -j 95 -t 100000 pgbench

on a 16-core Xeon server and got 84k tps.  I then applied another patch,
attached, which causes all the structures to be non-64-byte aligned, but
got the same tps number.

Can someone test these patches on an AMD CPU and see if you see a
difference?  Thanks.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
new file mode 100644
index 2ea2216..25b9eba
*** a/src/backend/storage/ipc/shmem.c
--- b/src/backend/storage/ipc/shmem.c
*************** ShmemInitStruct(const char *name, Size s
*** 413,418 ****
--- 413,419 ----
  					" \"%s\" (%zu bytes requested)",
  							name, size)));
  		}
+ 		fprintf(stderr, "64-byte shared memory alignment of %s: %d\n", name, ((int64)structPtr % 64) == 0);
  		result->size = size;
  		result->location = structPtr;
  	}
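
A note on the check used in the patch above: it casts the pointer to int64 before taking the remainder, which is fine for a throwaway debugging patch. A more portable way to express the same test is to go through uintptr_t. The following is only a minimal sketch under that assumption; is_aligned() and CACHE_LINE_SIZE are hypothetical names for illustration, not existing PostgreSQL symbols, and 64 bytes is assumed as the cache line size because that is the value tested in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is the value tested in this thread. */
#define CACHE_LINE_SIZE 64

/*
 * Hypothetical helper: returns 1 if ptr sits on an `alignment`-byte
 * boundary, 0 otherwise.  The pointer is converted to uintptr_t, the
 * unsigned integer type intended for holding pointer values.
 */
static int
is_aligned(const void *ptr, size_t alignment)
{
	/* The mask test below is only valid for a nonzero power of two. */
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

	return ((uintptr_t) ptr & (alignment - 1)) == 0;
}
```

With this helper, the debugging line could print is_aligned(structPtr, CACHE_LINE_SIZE) instead of the open-coded remainder.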
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
new file mode 100644
index 2ea2216..cc1ac1f
*** a/src/backend/storage/ipc/shmem.c
--- b/src/backend/storage/ipc/shmem.c
*************** ShmemInitStruct(const char *name, Size s
*** 327,332 ****
--- 327,335 ----
  	ShmemIndexEnt *result;
  	void	   *structPtr;
  
+ 	// if (strcmp(name, "Buffer Descriptors") == 0)
+ 		size += 32;
+ 
  	LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
  
  	if (!ShmemIndex)
*************** ShmemInitStruct(const char *name, Size s
*** 413,418 ****
--- 416,424 ----
  					" \"%s\" (%zu bytes requested)",
  							name, size)));
  		}
+ 		// if (strcmp(name, "Buffer Descriptors") == 0)
+ 			structPtr = (void *)((int64)structPtr + 4);
+ 		fprintf(stderr, "64-byte shared memory alignment of %s: %d\n", name, ((int64)structPtr % 64) == 0);
  		result->size = size;
  		result->location = structPtr;
  	}
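
To make the effect of the 4-byte shift in the second patch concrete, here is a small sketch of the arithmetic involved. It is not taken from the PostgreSQL tree: align_up() and cache_lines_touched() are hypothetical helpers, the 64-byte cache line is an assumption, and the 64-byte element size in the usage note is purely illustrative. The round-up is the kind of step a per-struct aligned allocator, such as the ShmemAllocAligned() idea quoted above, would presumably need.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; 64 bytes is the value tested in this thread. */
#define CACHE_LINE_SIZE 64

/*
 * Hypothetical helper: round offset up to the next multiple of
 * alignment, which must be a nonzero power of two.
 */
static size_t
align_up(size_t offset, size_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

	return (offset + alignment - 1) & ~(alignment - 1);
}

/*
 * Hypothetical helper: number of cache lines covered by an object of
 * `size` bytes that starts at byte `offset`.
 */
static size_t
cache_lines_touched(size_t offset, size_t size)
{
	size_t		first;
	size_t		last;

	assert(size > 0);

	first = offset / CACHE_LINE_SIZE;
	last = (offset + size - 1) / CACHE_LINE_SIZE;

	return last - first + 1;
}
```

For example, a 64-byte element at offset 0 touches one cache line, while the same element at offset 4 touches two; align_up(4, CACHE_LINE_SIZE) yields 64, the next aligned starting point.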