I have made a few more changes in the new patch:

1. Ran pgindent.
2. Instead of an atomic state variable to ensure that only one process caches the snapshot in shared memory, I have used a conditional try lwlock. With this, the code is small and reliable (a rough sketch of the locking scheme follows this list).
3. Performance benchmarking (details after the sketch below).
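To make item 2 concrete, here is a minimal sketch of the idea, not the actual patch code. LWLockConditionalAcquire() and LWLockRelease() are the existing lwlock primitives; everything else (CachedSnapshotLock, shmCachedSnapshot, and the helper functions) is a hypothetical name used only for illustration:

/*
 * Rough sketch, hypothetical names throughout except the lwlock
 * primitives.  The shared area holds one cached snapshot plus a
 * validity flag; the flag is cleared at every transaction end.
 */
Snapshot
GetSnapshotDataCached(Snapshot snapshot)
{
	/*
	 * Fast path: some backend has already published a snapshot since
	 * the last transaction ended, so just copy it.  (In reality this
	 * test and copy would happen under ProcArrayLock, as usual.)
	 */
	if (shmCachedSnapshot->valid)
		return CopySharedSnapshot(shmCachedSnapshot, snapshot);

	/* Slow path: compute the snapshot from the proc array as before. */
	ComputeSnapshotFromProcArray(snapshot);

	/*
	 * Try to publish the result.  The conditional acquire never blocks:
	 * if another backend is already storing its copy, we simply skip
	 * caching, so at most one process pays the cost of the store.
	 */
	if (LWLockConditionalAcquire(CachedSnapshotLock, LW_EXCLUSIVE))
	{
		StoreSnapshotInSharedMemory(shmCachedSnapshot, snapshot);
		shmCachedSnapshot->valid = true;
		LWLockRelease(CachedSnapshotLock);
	}

	return snapshot;
}

The conditional acquire is what keeps the code small: there is no retry loop or atomic state machine, and a backend that loses the race just returns its locally computed snapshot.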
Machine - cthulhu
==============

[mithun.cy@cthulhu bin]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             8
NUMA node(s):          8
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 47
Model name:            Intel(R) Xeon(R) CPU E7- 8830 @ 2.13GHz
Stepping:              2
CPU MHz:               1197.000
BogoMIPS:              4266.63
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              24576K
NUMA node0 CPU(s):     0,65-71,96-103
NUMA node1 CPU(s):     72-79,104-111
NUMA node2 CPU(s):     80-87,112-119
NUMA node3 CPU(s):     88-95,120-127
NUMA node4 CPU(s):     1-8,33-40
NUMA node5 CPU(s):     9-16,41-48
NUMA node6 CPU(s):     17-24,49-56
NUMA node7 CPU(s):     25-32,57-64

Server configuration:
./postgres -c shared_buffers=8GB -N 300 -c min_wal_size=15GB -c max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c checkpoint_completion_target=0.9 -c wal_buffers=256MB &

pgbench configuration:
scale_factor = 300
./pgbench -c $threads -j $threads -T $time_for_reading -M prepared -S postgres

The machine has 64 cores (128 hyperthreads). With this patch, the server starts showing improvement beyond 64 clients. I have tested up to 256 clients, with a maximum performance improvement of nearly 39%.

Alternatively, I considered letting each backend hold on to its previously computed snapshot until the next commit/rollback happens in the system, instead of storing the snapshot in shared memory. We could maintain a global counter, recorded alongside the snapshot whenever it is computed and incremented at every transaction end; a process that wants a new snapshot compares the two values to check whether it can reuse its previously computed one. This makes the code significantly simpler. But with the first approach, only one process has to compute and store the snapshot at each transaction end and the others can reuse the cached snapshot, whereas with the second approach every process has to recompute the snapshot. So I am keeping the first approach. For reference, a sketch of this alternative follows.
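Again a hypothetical sketch, not patch code: the pg_atomic_* calls are the existing atomics API, but shmState->xactCompletionCount, CopyLocalSnapshot(), and the other names are invented for illustration:

/* Per-backend memory of the last snapshot we computed. */
static Snapshot	lastSnapshot = NULL;
static uint64	lastSnapshotXactCount = 0;

/* Called at every commit/rollback to invalidate everyone's cache. */
void
AtEOXact_BumpSnapshotCounter(void)
{
	pg_atomic_fetch_add_u64(&shmState->xactCompletionCount, 1);
}

Snapshot
GetSnapshotDataAlternative(Snapshot snapshot)
{
	uint64	count = pg_atomic_read_u64(&shmState->xactCompletionCount);

	/* No transaction has ended since we last computed; reuse our copy. */
	if (lastSnapshot != NULL && count == lastSnapshotXactCount)
		return CopyLocalSnapshot(lastSnapshot, snapshot);

	/* Otherwise every backend recomputes for itself -- the drawback. */
	ComputeSnapshotFromProcArray(snapshot);
	lastSnapshot = SaveLocalSnapshot(snapshot);
	lastSnapshotXactCount = count;
	return snapshot;
}

The simplicity is visible, but so is the cost: after each transaction end, every backend recomputes the snapshot instead of just one.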
On Mon, Jul 10, 2017 at 10:13 AM, Mithun Cy <mithun...@enterprisedb.com> wrote:
> On Fri, Apr 8, 2016 at 12:13 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> I think that we really shouldn't do anything about this patch until
>> after the CLOG stuff is settled, which it isn't yet. So I'm going to
>> mark this Returned with Feedback; let's reconsider it for 9.7.
>
> I am updating a rebased patch and have tried to benchmark again; I could
> see good improvement in the pgbench read-only case at very high client
> counts on our cthulhu (8 nodes, 128 hyperthreads) and power2 (4 nodes,
> 192 hyperthreads) machines. There is some issue with base-code
> benchmarking which is somehow not consistent; once I figure out what the
> issue is, I will update.
>
> --
> Thanks and Regards
> Mithun C Y
> EnterpriseDB: http://www.enterprisedb.com

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

cache_the_snapshot_performance.ods
Description: application/vnd.oasis.opendocument.spreadsheet

Cache_data_in_GetSnapshotData_POC_03.patch
Description: Binary data