Re: [PERFORM] 60 core performance with 9.3

2014-08-14 Thread Mark Kirkwood
On 15/08/14 06:18, Josh Berkus wrote: Mark, Is the 60-core machine using some of the Intel chips which have 20 hyperthreaded virtual cores? If so, I've been seeing some performance issues on these processors. I'm currently doing a side-by-side hyperthreading on/off test. Hi Josh, The board

Re: [PERFORM] 60 core performance with 9.3

2014-08-14 Thread Josh Berkus
Mark, Is the 60-core machine using some of the Intel chips which have 20 hyperthreaded virtual cores? If so, I've been seeing some performance issues on these processors. I'm currently doing a side-by-side hyperthreading on/off test. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com

Re: [PERFORM] 60 core performance with 9.3

2014-08-11 Thread Mark Kirkwood
On 01/08/14 09:38, Alvaro Herrera wrote: Matt Clarkson wrote: The LWLOCK_STATS below suggest that ProcArrayLock might be the main source of locking that's causing throughput to take a dive as the client count increases beyond the core count. Any thoughts or comments on these results are welc

Re: [PERFORM] 60 core performance with 9.3

2014-07-31 Thread Alvaro Herrera
Matt Clarkson wrote: > The LWLOCK_STATS below suggest that ProcArrayLock might be the main > source of locking that's causing throughput to take a dive as the client > count increases beyond the core count. > Any thoughts or comments on these results are welcome! Do these results change if you u

Re: [PERFORM] 60 core performance with 9.3

2014-07-30 Thread Matt Clarkson
I've been assisting Mark with the benchmarking of these new servers. The drop off in both throughput and CPU utilisation that we've been observing as the client count increases has led me to investigate which lwlocks are dominant at different client counts. I've recompiled postgres with Andres L

Re: [PERFORM] 60 core performance with 9.3

2014-07-30 Thread Mark Kirkwood
On 31/07/14 00:47, Tomas Vondra wrote: On 30 Červenec 2014, 14:39, Tom Lane wrote: "Tomas Vondra" writes: On 30 Červenec 2014, 3:44, Mark Kirkwood wrote: While these numbers look great in the middle range (12-96 clients), then benefit looks to be tailing off as client numbers increase. Also

Re: [PERFORM] 60 core performance with 9.3

2014-07-30 Thread Mark Kirkwood
Hi Tomas, Unfortunately I think you are mistaken - disabling the stats collector (i.e. track_counts = off) means that autovacuum has no idea about when/if it needs to start a worker (as it uses those counts to decide), and hence you lose all automatic vacuum and analyze as a result. With res
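To make the dependency described above concrete, here is a minimal Python sketch. The helper name autovacuum_effective and the dict-of-settings input are illustrative assumptions, not anything PostgreSQL ships. The defaults used (autovacuum = on, track_counts = on) are the stock ones.

```python
def autovacuum_effective(settings):
    """Return True if autovacuum can actually launch workers.

    `settings` is a hypothetical dict of postgresql.conf-style values.
    Autovacuum needs both its own switch and track_counts enabled,
    because it uses the collected counts to decide when to run.
    """
    def is_on(name, default):
        value = str(settings.get(name, default)).strip().lower()
        return value in ("on", "true", "yes", "1")

    return is_on("autovacuum", "on") and is_on("track_counts", "on")


if __name__ == "__main__":
    # Turning off the counts silently disables automatic vacuum/analyze.
    print(autovacuum_effective({"track_counts": "off"}))
    print(autovacuum_effective({}))
```

On a live server the same check amounts to looking at SHOW track_counts and SHOW autovacuum.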

Re: [PERFORM] 60 core performance with 9.3

2014-07-30 Thread Tomas Vondra
On 30 Červenec 2014, 14:39, Tom Lane wrote: > "Tomas Vondra" writes: >> On 30 Červenec 2014, 3:44, Mark Kirkwood wrote: >>> While these numbers look great in the middle range (12-96 clients), >>> then >>> benefit looks to be tailing off as client numbers increase. Also >>> running >>> with no sta

Re: [PERFORM] 60 core performance with 9.3

2014-07-30 Thread Tom Lane
"Tomas Vondra" writes: > On 30 Červenec 2014, 3:44, Mark Kirkwood wrote: >> While these numbers look great in the middle range (12-96 clients), then >> benefit looks to be tailing off as client numbers increase. Also running >> with no stats (and hence no auto vacuum or analyze) is way too scary!

Re: [PERFORM] 60 core performance with 9.3

2014-07-30 Thread Tomas Vondra
On 30 Červenec 2014, 3:44, Mark Kirkwood wrote: > > While these numbers look great in the middle range (12-96 clients), then > benefit looks to be tailing off as client numbers increase. Also running > with no stats (and hence no auto vacuum or analyze) is way too scary! I assume you've disabled s

Re: [PERFORM] 60 core performance with 9.3

2014-07-29 Thread Mark Kirkwood
On 17/07/14 11:58, Mark Kirkwood wrote: Trying out with numa_balancing=0 seemed to get essentially the same performance. Similarly wrapping postgres startup with --interleave. All this made me want to try with numa *really* disabled. So rebooted the box with "numa=off" appended to the kernel c
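For anyone repeating the "numa=off" reboot test, a small Python sketch for confirming that the parameter really made it onto the running kernel's command line. The helper name is made up for illustration; /proc/cmdline is the standard Linux location for the boot parameters.

```python
def numa_disabled_on_cmdline(cmdline):
    """Return True if the kernel command line contains the numa=off parameter."""
    # Parameters are whitespace separated; match the exact token only.
    return "numa=off" in cmdline.split()


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(numa_disabled_on_cmdline(f.read()))
    except OSError:
        # Not Linux, or /proc is not readable.
        print("cannot read /proc/cmdline")
```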

Re: [PERFORM] 60 core performance with 9.3

2014-07-21 Thread Kevin Grittner
Mark Kirkwood wrote: > On 12/07/14 01:19, Kevin Grittner wrote: >> >> It might be worth a test using a cpuset to interleave OS cache and >> the NUMA patch I submitted to the current CF to see whether this is >> getting into territory where the patch makes a bigger difference. >> I would expect it

Re: [PERFORM] 60 core performance with 9.3

2014-07-16 Thread Mark Kirkwood
On 12/07/14 01:19, Kevin Grittner wrote: It might be worth a test using a cpuset to interleave OS cache and the NUMA patch I submitted to the current CF to see whether this is getting into territory where the patch makes a bigger difference. I would expect it to do much better than using numactl

Re: [PERFORM] 60 core performance with 9.3

2014-07-16 Thread Mark Kirkwood
On 11/07/14 20:22, Andres Freund wrote: On 2014-07-11 12:40:15 +1200, Mark Kirkwood wrote: Full report http://paste.ubuntu.com/886/ # 8.82% postgres [kernel.kallsyms] [k] _raw_spin_lock_irqsave | --- _raw_spin_lock_irqsave

Re: [PERFORM] 60 core performance with 9.3

2014-07-11 Thread Kevin Grittner
Mark Kirkwood wrote: > On 11/07/14 20:22, Andres Freund wrote: >> So, the majority of the time is spent in numa page migration. >> Can you disable numa_balancing? I'm not sure if your kernel >> version does that at runtime or whether you need to reboot. >> The kernel.numa_balancing sysctl might w
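A rough Python sketch for checking the kernel.numa_balancing sysctl mentioned above without rebooting. It assumes a kernel that exposes the setting at /proc/sys/kernel/numa_balancing (kernels built without automatic NUMA balancing will not have the file). The function names are illustrative only.

```python
from pathlib import Path


def parse_numa_balancing(raw):
    """Interpret the sysctl value: "0" means disabled, anything else enabled."""
    return raw.strip() != "0"


def numa_balancing_state(path="/proc/sys/kernel/numa_balancing"):
    """Return True/False for enabled/disabled, or None if the sysctl is absent."""
    try:
        return parse_numa_balancing(Path(path).read_text())
    except OSError:
        # File missing or unreadable: the kernel does not expose the knob.
        return None


if __name__ == "__main__":
    print(numa_balancing_state())
```

Where the file exists, writing 0 to it as root (or sysctl -w kernel.numa_balancing=0) should turn balancing off at runtime.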

Re: [PERFORM] 60 core performance with 9.3

2014-07-11 Thread Mark Kirkwood
On 11/07/14 20:22, Andres Freund wrote: On 2014-07-11 12:40:15 +1200, Mark Kirkwood wrote: Postgres 9.4 beta rwlock patch pgbench scale = 2000 On that scale - that's bigger than shared_buffers IIRC - I'd not expect the patch to make much of a difference. Right - we did test with it bigger

Re: [PERFORM] 60 core performance with 9.3

2014-07-11 Thread Andres Freund
On 2014-07-11 12:40:15 +1200, Mark Kirkwood wrote: > On 01/07/14 22:13, Andres Freund wrote: > >On 2014-07-01 21:48:35 +1200, Mark Kirkwood wrote: > >>- cherry picking the last 5 commits into 9.4 branch and building a package > >>from that and retesting: > >> > >>Clients | 9.4 tps 60 cores (rwlock)

Re: [PERFORM] 60 core performance with 9.3

2014-07-10 Thread Mark Kirkwood
On 01/07/14 22:13, Andres Freund wrote: On 2014-07-01 21:48:35 +1200, Mark Kirkwood wrote: - cherry picking the last 5 commits into 9.4 branch and building a package from that and retesting: Clients | 9.4 tps 60 cores (rwlock) +-- 6 | 70189 12 | 12889

Re: [PERFORM] 60 core performance with 9.3

2014-07-01 Thread Andres Freund
On 2014-07-01 21:48:35 +1200, Mark Kirkwood wrote: > On 27/06/14 21:19, Andres Freund wrote: > >On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: > >>My feeling is spinlock or similar, 'perf top' shows > >> > >>kernel find_busiest_group > >>kernel _raw_spin_lock > >> > >>as the top time users. > >

Re: [PERFORM] 60 core performance with 9.3

2014-07-01 Thread Mark Kirkwood
On 01/07/14 21:48, Mark Kirkwood wrote: [1] from git://git.postgresql.org/git/users/andresfreund/postgres.git, commits: 4b82477dcaf81ad7b0c102f4b66e479a5eb9504a 10d72b97f108b6002210ea97a414076a62302d4e 67ffebe50111743975d54782a3a94b15ac4e755f fe686ed18fe132021ee5e557c67cc4d7c50a1ada f2378dc2fa5b

Re: [PERFORM] 60 core performance with 9.3

2014-07-01 Thread Mark Kirkwood
On 27/06/14 21:19, Andres Freund wrote: On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: My feeling is spinlock or similar, 'perf top' shows kernel find_busiest_group kernel _raw_spin_lock as the top time users. Those don't tell that much by themselves, could you do a hierarchical profile?

Re: [PERFORM] 60 core performance with 9.3

2014-06-27 Thread Mark Kirkwood
On 27/06/14 21:19, Andres Freund wrote: On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: My feeling is spinlock or similar, 'perf top' shows kernel find_busiest_group kernel _raw_spin_lock as the top time users. Those don't tell that much by themselves, could you do a hierarchical profile?

Re: [PERFORM] 60 core performance with 9.3

2014-06-27 Thread Andres Freund
On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: > My feeling is spinlock or similar, 'perf top' shows > > kernel find_busiest_group > kernel _raw_spin_lock > > as the top time users. Those don't tell that much by themselves, could you do a hierarchical profile? I.e. perf record -ga? That'll a
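A hierarchical, system-wide profile of the kind asked for above could be scripted roughly as follows. This is only a sketch: -ga is shorthand for -g (call graphs) plus -a (all CPUs), while the 30 second window, the perf.data file name and the helper names are assumptions made for the example.

```python
import shlex


def perf_record_cmd(seconds=30, output="perf.data"):
    """Build a perf invocation that samples all CPUs with call graphs."""
    # "sleep N" just bounds how long perf keeps recording.
    return ["perf", "record", "-g", "-a", "-o", output, "--", "sleep", str(seconds)]


def perf_report_cmd(data="perf.data"):
    """Build the matching text-mode report command."""
    return ["perf", "report", "-i", data, "--stdio"]


if __name__ == "__main__":
    # Print the commands rather than run them; perf usually needs root.
    print(shlex.join(perf_record_cmd()))
    print(shlex.join(perf_report_cmd()))
```

Run the record step while pgbench is at the client count where throughput drops, then look at the call chains under the spinlock entries in the report.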

Re: [PERFORM] 60 core performance with 9.3

2014-06-26 Thread Mark Kirkwood
On 27/06/14 14:01, Scott Marlowe wrote: On Thu, Jun 26, 2014 at 5:49 PM, Mark Kirkwood wrote: I have a nice toy to play with: Dell R920 with 60 cores and 1TB ram [1]. The context is the current machine in use by the customer is a 32 core one, and due to growth we are looking at something large

Re: [PERFORM] 60 core performance with 9.3

2014-06-26 Thread Scott Marlowe
On Thu, Jun 26, 2014 at 5:49 PM, Mark Kirkwood wrote: > I have a nice toy to play with: Dell R920 with 60 cores and 1TB ram [1]. > > The context is the current machine in use by the customer is a 32 core one, > and due to growth we are looking at something larger (hence 60 cores). > > Some initial

[PERFORM] 60 core performance with 9.3

2014-06-26 Thread Mark Kirkwood
I have a nice toy to play with: Dell R920 with 60 cores and 1TB ram [1]. The context is the current machine in use by the customer is a 32 core one, and due to growth we are looking at something larger (hence 60 cores). Some initial tests show similar pgbench read only performance to what Rob