Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
David Boreham <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Having malloc/free use
>> an internal mutex is necessary in multi-threaded programs, but the
>> backend isn't multi-threaded.
> Hmm...confused. I'm not following why then there is contention for the
> mutex.
There isn't any content

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread David Boreham
Tom Lane wrote:
> Having malloc/free use an internal mutex is necessary in multi-threaded programs, but the backend isn't multi-threaded.
Hmm...confused. I'm not following why then there is contention for the mutex. Surely this has to be some other mutex that is in contention, not a heap loc

Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-20 Thread Gavin M. Roy
Having done something similar recently, I would recommend that you look at adding connection pooling using pgBouncer transaction pooling between your benchmark app and PgSQL. In our application we have about 2000 clients funneling down to 30 backends and are able to sustain large transaction per

Re: [PERFORM] large number of connected connections to postgres database (v8.0)

2007-07-20 Thread Josh Berkus
Fei Liu,
> It appears my multi-thread application (100 connections every 5 seconds)
> is stalled when working with postgresql database server. I have limited
> number of connections in my connection pool to postgresql to 20. At the
> beginning, connection is allocated and released from connection p

Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...

2007-07-20 Thread Josh Berkus
Marc,
> Server Specifications:
> --
>
> Sun SPARC Enterprise M8000 Server:
>
> http://www.sun.com/servers/highend/m8000/specs.xml
>
> File system:
>
> http://en.wikipedia.org/wiki/ZFS
There are some specific tuning parameters you need for ZFS or performance is going to suck.

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> Yes I did see increase in context switches and CPU migrations at that
> point using mpstat.
So follow that up --- try to determine which lock is being contended for. There's some very crude code in the sources that you can enable with -DLWLOCK_STAT

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
True, you can't switch off the locks, since libthread has been folded into libc in Solaris 10. Anyway, just to give you an idea of the increase in context switching at the break point, here is the mpstat output (taken at 10-second intervals) on this 8-socket Sun Fire V890. The low icsw (Involuntary Cont

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> What it's saying is that there are holds/waits in trying to get locks
> which are locked at Solaris user library levels called from the
> postgresql functions:
> For example both the following functions are hitting on the same mutex
> lock 0x10059

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
Sorry. They are Solaris mutex locks used by the postgresql process. What it's saying is that there are holds/waits in trying to get locks which are locked at Solaris user library levels called from the postgresql functions: For example both the following functions are hitting on the same mutex

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> So follow that up --- try to determine which lock is being contended
>> for. There's some very crude code in the sources that you can enable
>> with -DLWLOCK_STATS, but probably DTrace would be a better tool.
> Using plockstat -A

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
Tom Lane wrote:
> "Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
>> Yes I did see increase in context switches and CPU migrations at that point using mpstat.
> So follow that up --- try to determine which lock is being contended for. There's some very crude code in the sources that you can enable wi

Re: [PERFORM] 8.2 -> 8.3 performance numbers

2007-07-20 Thread Jim Nasby
On Jul 20, 2007, at 1:03 PM, Josh Berkus wrote:
> Jim,
>> Has anyone benchmarked HEAD against 8.2? I'd like some numbers to use in my OSCon lightning talk. Numbers for both with and without HOT would be even better (I know we've got HOT-specific benchmarks, but I want complete 8.2 -> 8.3 number

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
Yes I did see increase in context switches and CPU migrations at that point using mpstat.
Regards,
Jignesh
Tom Lane wrote:
> "Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
>> There are no hard failures reported anywhere. Log min durations does show that queries are now slowing down and taking l

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> There are no hard failures reported anywhere. Log min durations does
> show that queries are now slowing down and taking longer.
> OS is not swapping and also eliminated IO by putting the whole database
> on /tmp
Hmm. Do you see any evidence of a

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
I forgot to add one more piece of information. I also tried the same test with 64-bit postgresql with 6GB shared_buffers and the results are the same: it drops around the same point, which to me sounds like a bottleneck.
More later
-Jignesh
Jignesh K. Shah wrote:
> Awww Josh, I was just enjoying

Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Tom Lane
Dan Harris <[EMAIL PROTECTED]> writes:
> Since you mentioned the number of semops is distressingly high, does this
> indicate a tuning problem?
More like an old-version problem. We've done a lot of work on concurrent performance since 8.0.x, and most likely you are hitting one of the bottlenecks

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
Awww Josh, I was just enjoying the chat on the picket fence! :-) Anyway, the workload is mixed (reads, writes) with simple to medium queries. The workload is known to scale well. But in order to provide substantial input I am still trying to eliminate things that can bottleneck. Currently I hav

Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Dan Harris
Tom Lane wrote:
> Dan Harris <[EMAIL PROTECTED]> writes:
>> Here's the strace summary as run for a few second sample:
>> % time     seconds  usecs/call     calls    errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>>  97.25    0.671629          92      7272           s

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Josh Berkus
Greg, There's so much going on with a TPC-C kind of workload. Has anyone ever looked into quantifying scaling for more fundamental operations? There are so many places a complicated workload could get caught up that starting there is hard. I've found it's helpful to see the breaking points

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Greg Smith
On Thu, 19 Jul 2007, Josh Berkus wrote:
> It's a TPCC-like workload, so heavy single-row updates, and the updates/inserts are what's being measured.
There's so much going on with a TPC-C kind of workload. Has anyone ever looked into quantifying scaling for more fundamental operations? There

Re: [PERFORM] 8.2 -> 8.3 performance numbers

2007-07-20 Thread Josh Berkus
Jim,
> Has anyone benchmarked HEAD against 8.2? I'd like some numbers to use in my OSCon lightning talk. Numbers for both with and without HOT would be even better (I know we've got HOT-specific benchmarks, but I want complete 8.2 -> 8.3 numbers).
We've done it on TPCE, which is a hard benchma

Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread PFC
> Today, I looked at 'top' on my PG server and saw a pid that reported 270 hours of CPU time. Considering this is a very simple query, I was surprised to say the least. I was about to just kill the pid, but I figured I'd try and see exactly what it was stuck doing for so long.
If you are

Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Tom Lane
Dan Harris <[EMAIL PROTECTED]> writes:
> Here's the strace summary as run for a few second sample:
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  97.25    0.671629          92      7272           semop
>   1.7

[PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Dan Harris
Today, I looked at 'top' on my PG server and saw a pid that reported 270 hours of CPU time. Considering this is a very simple query, I was surprised to say the least. I was about to just kill the pid, but I figured I'd try and see exactly what it was stuck doing for so long. Here's the strac

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:
> Josh Berkus <[EMAIL PROTECTED]> writes:
>
>> That's an interesting thought. Let me check lock counts and see if this is
>> possibly the case.
>
> AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
> space entirely
I assume you've check