[PERFORM] connections slowing everything down?

2008-04-21 Thread Adrian Moisey

Hi

# ps -ef | grep idle | wc -l
87
# ps -ef | grep SELECT | wc -l
5


I have 2 web servers which connect to PGPool which connects to our 
postgres db.  I have noticed that idle connections seem to take up CPU 
and RAM (according to top).  Could this in any way cause things to slow 
down?
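
For reference, the same counts can be taken from inside the database; a rough
sketch, assuming a superuser session on 8.x, where idle backends report
'<IDLE>' in pg_stat_activity:

  SELECT CASE WHEN current_query = '<IDLE>' THEN 'idle' ELSE 'active' END AS state,
         count(*)
  FROM pg_stat_activity
  GROUP BY 1;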


--
Adrian Moisey
Systems Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842



[PERFORM] Performance of the Materialize operator in a query plan

2008-04-21 Thread Viktor Rosenfeld

Hi,

I'm having trouble understanding the cost of the Materialize  
operator.  Consider the following plan:


Nested Loop  (cost=2783.91..33217.37 rows=78634 width=44) (actual time=77.164..2478.973 rows=309 loops=1)
  Join Filter: ((rank2.pre <= rank5.pre) AND (rank5.pre <= rank2.post))
  ->  Nested Loop  (cost=0.00..12752.06 rows=1786 width=33) (actual time=0.392..249.255 rows=9250 loops=1)
        [...]
  ->  Materialize  (cost=2783.91..2787.87 rows=396 width=22) (actual time=0.001..0.072 rows=587 loops=9250)
        ->  Nested Loop  (cost=730.78..2783.51 rows=396 width=22) (actual time=7.637..27.030 rows=587 loops=1)



The cost of the inner-most Nested Loop is 27 ms, but the total cost of  
the Materialize operator is 666 ms (9250 loops * 0.072 ms per  
iteration).  So, Materialize introduces more than 10x overhead.  Is  
this the cost of writing the table to temporary storage or am I  
misreading the query plan output?


Furthermore, the outer table is almost 20x as big as the inner table.   
Wouldn't the query be much faster by switching the inner with the  
outer table?  I have switched off GEQO, so Postgres should find the 
optimal query plan.


Cheers,
Viktor



Re: [PERFORM] connections slowing everything down?

2008-04-21 Thread Erik Jones


On Apr 21, 2008, at 4:50 AM, Adrian Moisey wrote:


Hi

# ps -ef | grep idle | wc -l
87
# ps -ef | grep SELECT | wc -l
5


I have 2 web servers which connect to PGPool which connects to our  
postgres db.  I have noticed that idle connections seem to take up  
CPU and RAM (according to top).  Could this in any way cause things  
to slow down?


Dependent on how much memory you have in your system, yes.  You can 
fix the constant use of memory by idle connections by adjusting the 
child_life_time setting in your pgpool.conf file.  The default is 5 
minutes, which is a bit long.  Try dropping that down to 20 or 30 seconds.


Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com






Re: [PERFORM] connections slowing everything down?

2008-04-21 Thread Adrian Moisey

Hi


# ps -ef | grep idle | wc -l
87

[...]

I have 2 web servers which connect to PGPool which connects to our 
postgres db.  I have noticed that idle connections seem to take up CPU 
and RAM (according to top).  Could this in any way cause things to 
slow down?


Dependent on how much memory you have in your system, yes.  You can fix 
the constant use of memory by idle connections by adjusting the 
child_life_time setting in your pgpool.conf file.  The default is 5 
minutes, which is a bit long.  Try dropping that down to 20 or 30 seconds.


We have 32GB.  If I get it to close the connections faster, will that 
actually help?  Is there a way I can figure it out?



--
Adrian Moisey
Systems Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842



Re: [PERFORM] Performance of the Materialize operator in a query plan

2008-04-21 Thread Tom Lane
Viktor Rosenfeld <[EMAIL PROTECTED]> writes:
> I'm having trouble understanding the cost of the Materialize  
> operator.  Consider the following plan:

> Nested Loop  (cost=2783.91..33217.37 rows=78634 width=44) (actual time=77.164..2478.973 rows=309 loops=1)
>   Join Filter: ((rank2.pre <= rank5.pre) AND (rank5.pre <= rank2.post))
>   ->  Nested Loop  (cost=0.00..12752.06 rows=1786 width=33) (actual time=0.392..249.255 rows=9250 loops=1)
>         [...]
>   ->  Materialize  (cost=2783.91..2787.87 rows=396 width=22) (actual time=0.001..0.072 rows=587 loops=9250)
>         ->  Nested Loop  (cost=730.78..2783.51 rows=396 width=22) (actual time=7.637..27.030 rows=587 loops=1)
>  

> The cost of the inner-most Nested Loop is 27 ms, but the total cost of  
> the Materialize operator is 666 ms (9250 loops * 0.072 ms per  
> iteration).  So, Materialize introduces more than 10x overhead.

Not hardly.  Had the Materialize not been there, we'd have executed
the inner nestloop 9250 times, for a total cost of 9250 * 27ms.
(Actually it might have been less due to cache effects, but still
a whole lot more than 0.072 per iteration.)
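
In round numbers: 9250 * 27ms is about 250 seconds of inner-nestloop work
avoided, versus 9250 * 0.072ms, or roughly 0.67 seconds, actually spent
rescanning the materialized tuplestore.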

These numbers say that it's taking the Materialize about 120 microsec
per row returned, which seems a bit high to me considering that the
data is just sitting in a tuplestore.  I surmise that you are using
a machine with slow gettimeofday() and that's causing the measurement
overhead to be high.

regards, tom lane



Re: [PERFORM] connections slowing everything down?

2008-04-21 Thread Erik Jones


On Apr 21, 2008, at 9:15 AM, Adrian Moisey wrote:


Hi


# ps -ef | grep idle | wc -l
87

[...]

I have 2 web servers which connect to PGPool which connects to our  
postgres db.  I have noticed that idle connections seem to take up  
CPU and RAM (according to top).  Could this in any way cause  
things to slow down?
Dependent on how much memory you have in your system, yes.  You can 
fix the constant use of memory by idle connections by adjusting the 
child_life_time setting in your pgpool.conf file.  The default is 5 
minutes, which is a bit long.  Try dropping that down to 20 or 30 
seconds.


We have 32GB.  If I get it to close the connections faster, will 
that actually help?  Is there a way I can figure it out?


First, sorry, I gave you the wrong config setting, I meant  
connection_life_time.  child_life_time is the lifetime of an idle pool  
process on the client machine and the connection_life_time is the  
lifetime of an idle connection (i.e. no transaction running) on the  
server.  With the default connection_life_time of 5 minutes it's  
easily possible to keep a connection open indefinitely.  Imagine a 
client gets a connection and runs a single query, then nothing happens  
on that connection for 4:30 minutes at which point another single  
query is run.  If that pattern continues that connection will never be  
relinquished.  While the point of a pool is to cut down on the number  
of connections that need to be established, you don't necessarily want  
to go the extreme and never tear down connections as that will cause a  
degradation in available server resources.  With a smaller, but not 0,  
connection life time, connections will stay open and available during  
periods of high work rates from the client, but will be relinquished  
when there isn't as much to do.
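
For concreteness, the relevant pgpool.conf lines would look something like
this (the values are illustrative, not recommendations):

  # pgpool.conf
  child_life_time      = 300    # idle lifetime of a pool process, in seconds
  connection_life_time = 30     # idle lifetime of a cached backend connection, in seconds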


Without more details on what exactly is happening on your system I 
can't say for sure that this is your fix.  Are you tracking/monitoring 
your server's free memory?  If not, I'd suggest getting either Cacti or 
Monit in place to monitor system stats such as free memory (using 
vmstat), system IO (using iostat), and db transaction rates (using db 
queries).  Then you'll be able to draw correlations between 
application behavior (slowness, etc.) and actual system numbers.  I 
know that I had issues with connections being held open for long times 
(using the default 300s): our free memory would gradually decrease 
over the day, and resetting our pools would clear it out, so there was a 
direct cause-and-effect relationship there.  When I dropped the 
connection_life_time to 30s the problem went away.
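
As a starting point for the db-side numbers, a simple sketch you could run
periodically from cron and graph (the counters are cumulative, so diff
successive samples to get rates):

  SELECT datname, xact_commit, xact_rollback
  FROM pg_stat_database
  WHERE datname = current_database();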


Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com






Re: [PERFORM] Vacuum settings

2008-04-21 Thread Guillaume Cottenceau
dforums  writes:

> 2Q) Here are my settings for vacuum, could you help me to optimise
> those settings?  At the moment the vacuum analyse run every night is
> taking around 18 h to complete, which slows down the server performance.

It's a lot of time for a daily job (and it is worthwhile to
vacuum hot tables more often than daily). With typical settings,
it's probable that autovacuum will run forever (e.g. at the end
of a run, another run will already be needed). You should first
verify you don't have bloat in your tables (a lot of dead rows) -
bloat can be created by too infrequent vacuuming and too low FSM
settings[1]. To fix the bloat, you can dump and restore your DB
if you can afford interrupting your application, or use VACUUM
FULL if you can afford blocking your application (disclaimer:
many posters here passionately detest VACUUM FULL and keep on
suggesting the use of CLUSTER instead).
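
As a rough sketch of those two options (table and index names are made up;
both commands take heavy locks, so schedule them in a maintenance window):

  VACUUM FULL interesting_table;                  -- compacts in place, exclusive lock
  CLUSTER interesting_index ON interesting_table; -- rewrites the table in index order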

Ref: 
[1] to tell whether you have bloat, you can use
contrib/pgstattuple (you can easily add it to a running
PostgreSQL). If the free_percent reported for interesting
tables is large, and free_space is large compared to 8K, then
you have bloat;

another way is to dump your database, restore it onto another
database, issue VACUUM VERBOSE on a given table on both
databases (on the live one and on the restored one) and compare the
reported number of pages needed. The difference is the
bloat.

  live=# VACUUM VERBOSE interesting_table;
  [...]
  INFO:  "interesting_table": found 408 removable, 64994 nonremovable row 
versions in 4395 pages

  restored=# VACUUM VERBOSE interesting_table;
  [...]
  INFO:  "interesting_table": found 0 removable, 64977 nonremovable row 
versions in 628 pages

=> (4395-628)*8/1024.0 ≈ 29.4 MB of bloat

(IIRC, this VACUUM output is for 7.4, it has changed a bit
since then)
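
For the pgstattuple route, a minimal sketch (assuming the contrib module is
installed; the columns are those documented for contrib/pgstattuple):

  SELECT table_len, dead_tuple_percent, free_space, free_percent
  FROM pgstattuple('interesting_table');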

-- 
Guillaume Cottenceau



Re: [PERFORM] Vacuum settings

2008-04-21 Thread Alvaro Herrera
dforums wrote:
> Hello,
>
> I need two pieces of advice on vacuum settings.
>
> I have a quad core X5355 @ 2.66GHz with 8 GB of memory
>
> 1Q) Why does autovacuum not work?  I have set the value to on in 
> postgresql.conf but when the server starts it's still off 

You need to turn stats_row_level on too.
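
In postgresql.conf that means something like the following (a sketch for a
pre-8.3 server; on 8.3 and later the row-level stats settings are gone and
autovacuum = on is enough):

  stats_start_collector = on
  stats_row_level       = on
  autovacuum            = on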

> # - Cost-Based Vacuum Delay -
>
> vacuum_cost_delay = 5   # 0-1000 milliseconds
> vacuum_cost_page_hit = 1000 # 0-10000 credits
> vacuum_cost_page_miss = 1000    # 0-10000 credits
> vacuum_cost_page_dirty = 120    # 0-10000 credits
> vacuum_cost_limit = 20  # 0-10000 credits

The costs are all too high and the limit too low.  I suggest resetting to
the default values, and figuring out a reasonable delay limit (your
current 5ms value seems a bit too low, but I think in most cases 10ms is
the practical limit due to sleep granularity in the kernel.  In any
case, since the other values are all wrong I suggest just setting it to
10ms and seeing what happens).
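
Concretely, that would be something like this (the non-delay values below are
the stock defaults):

  vacuum_cost_delay = 10      # milliseconds
  vacuum_cost_page_hit = 1
  vacuum_cost_page_miss = 10
  vacuum_cost_page_dirty = 20
  vacuum_cost_limit = 200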

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[PERFORM] Re: [HACKERS] [COMMITTERS] pgsql: Fix TransactionIdIsCurrentTransactionId() to use binary search

2008-04-21 Thread Robert Treat
On Thursday 27 March 2008 17:11, Tom Lane wrote:
> Robert Treat <[EMAIL PROTECTED]> writes:
> > On Sunday 16 March 2008 22:18, Tom Lane wrote:
> > > > > Fix TransactionIdIsCurrentTransactionId() to use binary 
> > > > > search instead 
> > > > > of linear search when checking child-transaction XIDs.
> >
> > > > Are there any plans to backpatch this into REL8_3_STABLE?
> > >
> > >  No.
> > >
> > > > It looks like I am
> > > > hitting a pretty serious performance regression on 8.3 with a stored
> > > > procedure that grabs a pretty big recordset, and loops through doing
> > > > insert/update on unique failures.  The procedure gets progressively
> > > > slower the more records are involved... and dbx shows me stuck in
> > > > TransactionIdIsCurrentTransactionId().
> > >
> > > If you can convince me it's a regression I might reconsider, but I
> > > rather doubt that 8.2 was better,
> > > 

> > Well, I can't speak for 8.2, but I have a second system crunching the
> > same data using the same function on 8.1 (on lesser hardware in fact),
> > and it doesn't have these type of issues.
>
> If you can condense it to a test case that is worse on 8.3 than 8.1,
> I'm willing to listen...

I spent some time trying to come up with a test case, but had no luck.  Dtrace 
showed that the running process was calling this function rather excessively; 
sample profiling for 30 seconds would look like this: 

FUNCTION                                        COUNT   PCNT

postgres`LockBuffer                                10   0.0%
postgres`slot_deform_tuple                         11   0.0%
postgres`ExecEvalScalarVar                         11   0.0%
postgres`ExecMakeFunctionResultNoSets              13   0.0%
postgres`IndexNext                                 14   0.0%
postgres`slot_getattr                              15   0.0%
postgres`LWLockRelease                             20   0.0%
postgres`index_getnext                             55   0.1%
postgres`TransactionIdIsCurrentTransactionId    40074  99.4%

But I saw similar percentages on the 8.1 machine, so I am not convinced this 
is where the problem is.  Unfortunately (in some respects) the problem went 
away until this morning, so I haven't been looking at it since the above 
exchange.  I'm still open to the idea that something inside 
TransactionIdIsCurrentTransactionId could have changed to make things worse.  
In addition to CPU, the process does consume a significant amount of 
memory; prstat shows:

 PID USERNAME  SIZE   RSS STATE  PRI NICE  TIME  CPU PROCESS/NLWP
 3844 postgres 1118M 1094M cpu3    50    0   6:25:48  12% postgres/1

I do wonder if the number of rows being worked on is significant in some 
way... by looking in the job log for the running procedure (we use 
autonomous logging in this function), I can see that it has a much larger 
number of rows to be processed, so perhaps there is simply a tipping point 
that is reached which causes it to stop performing... still, it is 
curious that I never saw this behavior on 8.1.

= current job
     elapsed     |                          status
-----------------+----------------------------------------------------------
 00:00:00.042895 | OK/starting with 2008-04-21 03:20:03
 00:00:00.892663 | OK/processing 487291 hits up until 2008-04-21 05:20:03
 05:19:26.595508 | ??/Processed 7 aggregated rows so far
(3 rows)

= yesterday's run
|     elapsed     |                          status
+-----------------+----------------------------------------------------------
| 00:00:00.680222 | OK/starting with 2008-04-20 04:20:02
| 00:00:00.409331 | OK/processing 242142 hits up until 2008-04-20 05:20:04
| 00:25:02.306736 | OK/Processed 35936 aggregated rows
| 00:00:00.141179 | OK/
(4 rows)

Unfortunately I don't have the 8.1 system to bang on anymore for this 
(though, anecdotally speaking, I never saw this behavior in 8.1); however, I do 
now have a parallel 8.3 system crunching the data, and it is showing the same 
symptom (yes, two 8.3 servers, crunching the same data, both bogged down now), 
so I do feel this is something specific to 8.3.

I am mostly wondering if anyone else has encountered behavior like this on 8.3 
(large sets of insert/update exception blocks in plpgsql bogging down), or 
if anyone has any thoughts on which direction I should poke at it from here. 
TIA.
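
For reference, a bare sketch of the pattern in question (table and variable
names are made up); each pass through the EXCEPTION block starts a
subtransaction, i.e. a child XID of the main transaction:

  -- inside a plpgsql loop over the recordset
  BEGIN
      INSERT INTO agg_table (agg_key, hits) VALUES (k, n);
  EXCEPTION WHEN unique_violation THEN
      UPDATE agg_table SET hits = hits + n WHERE agg_key = k;
  END;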

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [PERFORM] Re: [HACKERS] [COMMITTERS] pgsql: Fix TransactionIdIsCurrentTransactionId() to use binary search

2008-04-21 Thread Alvaro Herrera
Robert Treat wrote:

> Unfortunately I don't have the 8.1 system to bang on anymore for this (though, 
> anecdotally speaking, I never saw this behavior in 8.1); however, I do now have 
> a parallel 8.3 system crunching the data, and it is showing the same symptom 
> (yes, two 8.3 servers, crunching the same data, both bogged down now), so I do 
> feel this is something specific to 8.3.  
> 
> I am mostly wondering if anyone else has encountered behavior like this on 8.3 
> (large sets of insert/update exception blocks in plpgsql bogging down), or 
> if anyone has any thoughts on which direction I should poke at it from here. 
> TIA.

Perhaps what you could do is backpatch the change and see if the problem
goes away.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [PERFORM] Group by more efficient than distinct?

2008-04-21 Thread PFC
On Sun, 20 Apr 2008 17:15:36 +0200, Francisco Reyes  
<[EMAIL PROTECTED]> wrote:



PFC writes:

- If you process up to some percentage of your RAM worth of data,  
hashing  is going to be a lot faster


Thanks for the excellent breakdown and explanation. I will try and get  
sizes of the tables in question and how much memory the machines have.


	Actually, the memory used by the hash depends on the number of distinct  
values, not the number of rows which are processed...

Consider :

SELECT a GROUP BY a
SELECT a,count(*) GROUP BY a

	In both cases the hash only holds distinct values. So if you have 1 
million rows to process but only 10 distinct values of "a", the hash will  
only contain those 10 values (and the counts), so it will be very small  
and fast, it will absorb a huge seq scan without problem. If however, you  
have (say) 100 million distinct values for a, using a hash would be a bad  
idea. As usual, divide the size of your RAM by the number of concurrent  
connections or something.
	Note that "a" could be a column, several columns, anything, the size of  
the hash will be proportional to the number of distinct values, ie. the  
number of rows returned by the query, not the number of rows processed  
(read) by the query. Same with hash joins etc, that's why when you join a  
very small table to a large one Postgres likes to use seq scan + hash join  
on the small table.
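
	A quick way to check which strategy the planner picked for a given query is
just to EXPLAIN it (table and column names here are made up):

	EXPLAIN SELECT a FROM t GROUP BY a;   -- HashAggregate vs. GroupAggregate (sort-based)
	EXPLAIN SELECT DISTINCT a FROM t;     -- on 8.x this is typically Unique over a Sort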




- If you need DISTINCT ON, well, you're stuck with the Sort
- So, for the time being, you can replace DISTINCT with GROUP BY...


Have seen a few of those already on some code (new job..) so for those  
it is a matter of having a good disk subsystem?


	Depends on your RAM, sorting in RAM is always faster than sorting on disk  
of course, unless you eat all the RAM and trash the other processes.  
Tradeoffs...






Re: [PERFORM] Group by more efficient than distinct?

2008-04-21 Thread Mark Mielke

PFC wrote:
Actually, the memory used by the hash depends on the number of 
distinct values, not the number of rows which are processed...

Consider :

SELECT a GROUP BY a
SELECT a,count(*) GROUP BY a

In both cases the hash only holds distinct values. So if you have 
1 million rows to process but only 10 distinct values of "a", the hash 
will only contain those 10 values (and the counts), so it will be very 
small and fast, it will absorb a huge seq scan without problem. If 
however, you have (say) 100 million distinct values for a, using a 
hash would be a bad idea. As usual, divide the size of your RAM by the 
number of concurrent connections or something.
Note that "a" could be a column, several columns, anything, the 
size of the hash will be proportional to the number of distinct 
values, ie. the number of rows returned by the query, not the number 
of rows processed (read) by the query. Same with hash joins etc, 
that's why when you join a very small table to a large one Postgres 
likes to use seq scan + hash join on the small table.


This surprises me - hash values are lossy, so it must still need to 
confirm against the real list of values, which at a minimum should 
require references to the rows to check against?


Is PostgreSQL doing something beyond my imagination? :-)

Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>




Re: [PERFORM] Group by more efficient than distinct?

2008-04-21 Thread Mark Mielke

Mark Mielke wrote:

PFC wrote:
Actually, the memory used by the hash depends on the number of 
distinct values, not the number of rows which are processed...

Consider :

SELECT a GROUP BY a
SELECT a,count(*) GROUP BY a

In both cases the hash only holds distinct values. So if you have 
1 million rows to process but only 10 distinct values of "a", the 
hash will only contain those 10 values (and the counts), so it will 
be very small and fast, it will absorb a huge seq scan without 
problem. If however, you have (say) 100 million distinct values for 
a, using a hash would be a bad idea. As usual, divide the size of 
your RAM by the number of concurrent connections or something.
Note that "a" could be a column, several columns, anything, the 
size of the hash will be proportional to the number of distinct 
values, ie. the number of rows returned by the query, not the number 
of rows processed (read) by the query. Same with hash joins etc, 
that's why when you join a very small table to a large one Postgres 
likes to use seq scan + hash join on the small table.


This surprises me - hash values are lossy, so it must still need to 
confirm against the real list of values, which at a minimum should 
require references to the rows to check against?


Is PostgreSQL doing something beyond my imagination? :-)


Hmmm... You did say distinct values, so I can see how that would work 
for distinct. What about seq scan + hash join, though? To complete the 
join, wouldn't it need to have a reference to each of the rows to join 
against? If there are 20 distinct values and 200 rows in the small table 
- wouldn't it need 200 references to be stored?


Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>

