Re: [PERFORM] Why is the query not using the index for sorting?

2009-11-23 Thread Matthew Wakeling

On Sun, 22 Nov 2009, Jonathan Blitz wrote:

I have a table with a number of columns.
 
I perform
 
Select *
from table
order by a,b
 
There is an index on a,b which is clustered (as well as indexes on a and b 
alone).
I have issued the cluster and analyze commands.


Did you analyse *after* creating the index and clustering, or before?
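For reference, the order that works is index, then cluster, then analyze - a sketch with placeholder names (`CLUSTER table USING index` is the 8.4 spelling; before 8.4 it is `CLUSTER index ON table`):

```sql
CREATE INDEX tbl_a_b_idx ON tbl (a, b);
CLUSTER tbl USING tbl_a_b_idx;  -- rewrite the table in index order
ANALYZE tbl;                    -- stats must reflect the new layout
```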

Matthew

--
[About NP-completeness] These are the problems that make efficient use of
the Fairy Godmother.-- Computer Science Lecturer
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Query times change by orders of magnitude as DB ages

2009-11-23 Thread Matthew Wakeling

On Sun, 22 Nov 2009, Richard Neill wrote:

Worse still, doing a cluster of most of the tables and vacuum full analyze


Why are you doing a vacuum full? That command is not meant to be used 
except in the most unusual of circumstances, as it causes bloat to 
indexes.


If you have run a cluster command, then running vacuum full will make the 
table and index layout worse, not better.
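A hedged sketch of the maintenance sequence this implies (`mytable` is a placeholder):

```sql
-- CLUSTER rewrites the table compactly and rebuilds its indexes, so a
-- plain VACUUM ANALYZE afterwards is enough; VACUUM FULL would move
-- rows around again and bloat the freshly rebuilt indexes.
CLUSTER mytable;
VACUUM ANALYZE mytable;
```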


Matthew

--
Riker: Our memory pathways have become accustomed to your sensory input.
Data:  I understand - I'm fond of you too, Commander. And you too Counsellor



Re: [PERFORM] Why is the query not using the index for sorting?

2009-11-23 Thread Jonathan Blitz
Definitely after.

Jonathan 

-Original Message-
From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Matthew
Wakeling
Sent: Monday, November 23, 2009 1:00 PM
To: Jonathan Blitz
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Why is the query not using the index for sorting?

On Sun, 22 Nov 2009, Jonathan Blitz wrote:
> I have a table with a number of columns.
>  
> I perform
>  
> Select *
> from table
> order by a,b
>  
> There is an index on a,b which is clustered (as well as indexes on a and b
alone).
> I have issued the cluster and analyze commands.

Did you analyse *after* creating the index and clustering, or before?

Matthew

--
 [About NP-completeness] These are the problems that make efficient use of
 the Fairy Godmother.-- Computer Science Lecturer




Re: [PERFORM] View based upon function won't use index on joins

2009-11-23 Thread Robert Haas
2009/11/20 Jonathan Foy :
> Shy of just throwing a trigger in the
> table to actually populate a second table with the same data solely for
> reporting purposes,

That's what I would do in your situation, FWIW.  Query optimization is
a hard problem even under the best of circumstances; getting the
planner to DTRT with a crazy schema is - well, really hard.

...Robert



Re: [PERFORM] Query times change by orders of magnitude as DB ages

2009-11-23 Thread Robert Haas
On Sun, Nov 22, 2009 at 10:31 AM, Richard Neill  wrote:
> Dear All,
>
> Thanks for your help earlier with the previous question. I wonder if I might
> ask another.
>
>
> We have various queries that need to run, of which I'm going to focus on 2,
> "vox" and "du_report".
>
> Both of them are extremely sensitive to the precise values of
> random_page_cost and seq_page_cost. Experimentally, I've used:
>
>  A:  seq_page_cost = 0.25;  random_page_cost = 0.75
>  B:  seq_page_cost = 0.5;  random_page_cost = 2
>  C: seq_page_cost = 1;  random_page_cost = 4
>
> (and a few in between).
>
>
> If I pick the wrong one, then either vox becomes 2 orders of magnitude
> slower (22ms -> 3.5 seconds), or du_report becomes 10x slower. I can't use
> the same setting for both.
>
> So, as a very ugly hack, I've tuned the sweet spots for each query.
> Vox normally sits at B; du_report at C.
>
>
> Now, the real killer is that the position of that sweet spot changes over
> time as the DB ages over a few days (even though autovacuum is on).
>
> Worse still, doing a cluster of most of the tables and vacuum full analyze
> made most of the queries respond much better, but the vox query became very
> slow again, until I set it to A (which, a few days ago, did not work well).
>
>
> *  Why is the query planner so precisely sensitive to the combination of
> page costs and time since last vacuum full?

It sounds like your tables are getting bloated.  If you have
autovacuum turned on, this shouldn't be happening.  What sort of
workload is this?  What PG version?

> * Why is it that what improves one query can make another get so much worse?

Because it changes the plan you get.

> * Is there any way I can nail the query planner to a particular query plan,
> rather than have it keep changing its mind?

See other responses.

> * Is it normal to keep having to tune the query-planner's settings, or
> should it be possible to set it once, and leave it?

Leave it.
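The per-query "sweet spot" hack at least doesn't have to be server-wide: the page costs can be pinned per transaction with SET LOCAL (the values below are the poster's own setting B):

```sql
BEGIN;
SET LOCAL seq_page_cost = 0.5;    -- setting B, tuned for the vox query
SET LOCAL random_page_cost = 2;
-- ... run the vox query here ...
COMMIT;  -- SET LOCAL reverts automatically at transaction end
```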

...Robert



Re: [PERFORM] Performance degrade running on multicore computer

2009-11-23 Thread Dave Youatt
afancy wrote:
> Hi,
>
> My PostgreSQL server has two CPUs (OS: Fedora 11), each with 4 cores,
> for 8 cores in total.  I have several clients running at the same time
> doing inserts and updates on the same table, each client having its
> own connection.  I have run two tests with clients loading 20M rows in
> parallel.  In each test, the data is split evenly by client count, so
> that each client loads only its own piece of the data.
>
> 1) Long transaction: each client commits once, at the end of loading.
> Result: each postgres backend consumes 95% CPU, and the more clients
> run in parallel, the slower the total running time (8 clients is
> slowest).  I expected the opposite: the more clients run in parallel,
> the faster all the data should load.
>
> 2) Short transaction: I set the clients to commit every 500 records.
> Result: each postgres backend consumes about 50% CPU, and the total
> running time is as I expected; the more clients run in parallel, the
> faster it is (8 clients is fastest).
>
> Could anybody explain why the long transaction with 8 clients is
> slowest, and how I can solve this problem?  I don't want to use
> approach 2), where I have to set the commit size each time.
>
> Thanks a lot!!
>
> -Afancy
>

Since you have 2 cpus, you may want to try setting the processor
affinity for postgres (server and client programs) to the 4 cores on one
of the cpus (taskset command on linux).  Here's an excerpt from a
modified /etc/init.d/postgresql:

   $SU -l postgres -c "taskset -c 4-7 $PGENGINE/postmaster -p '$PGPORT' -D 
'$PGDATA' ${PGOPTS} &" >> "$PGLOG" 2>&1 < /dev/null 


Thanks to Greg Smith for pointing this out when we had a similar issue
with a 2-cpu server.
NB:  This was with postgresql 8.3.  Don't know if 8.4+ has built-in
processor affinity.
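For the client side, the same pinning can be done ad hoc (a sketch; core numbers, port, and database name depend on your machine):

```shell
# Pin an interactive client to the other CPU's cores (0-3), matching
# the server pinned to cores 4-7 in the init-script excerpt above.
taskset -c 0-3 psql -p 5432 mydb

# Or re-pin an already-running process by PID:
taskset -cp 0-3 12345
```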

(Apologies in advance for the email formatting.)



Re: [PERFORM] [GENERAL] Strange performance degradation

2009-11-23 Thread Lorenzo Allegrucci

Tom Lane wrote:

Lorenzo Allegrucci  writes:

So, my main question is.. how can just a plain simple restart of postgres
restore the original performance (3% cpu time)?


Are you killing off any long-running transactions when you restart?


After three days of patient waiting it looks like the common
'idle in transaction' problem..

[sorry for >80 cols]

19329 ?S 15:54 /usr/lib/postgresql/8.3/bin/postgres -D 
/var/lib/postgresql/8.3/main -c 
config_file=/etc/postgresql/8.3/main/postgresql.conf
19331 ?Ss 3:40  \_ postgres: writer process
19332 ?Ss 0:42  \_ postgres: wal writer process
19333 ?Ss 15:01  \_ postgres: stats collector process
19586 ?Ss   114:00  \_ postgres: forinicom weadmin [local] idle
20058 ?Ss 0:00  \_ postgres: forinicom weadmin [local] idle
13136 ?Ss 0:00  \_ postgres: forinicom weadmin 192.168.4.253(43721) 
idle in transaction

My app is a Django webapp, maybe there's some bug in the Django+psycopg2 stack?

Anyway, how can I get rid those "idle in transaction" processes?
Can I just kill -15 them or is there a less drastic way to do it?
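Before signalling anything, the offenders can be listed from pg_stat_activity (8.3 column names):

```sql
-- Backends sitting idle inside an open transaction, oldest first.
SELECT procpid, usename, client_addr, xact_start
  FROM pg_stat_activity
 WHERE current_query = '<IDLE> in transaction'
 ORDER BY xact_start;
```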



[PERFORM] Best possible way to insert and get returned ids

2009-11-23 Thread Jason Dictos
Question:

Is an INSERT command with a RETURNING * clause faster than, say, an
INSERT followed by a SELECT? Does RETURNING * simply amount to a normal
SELECT on the added rows? We need to insert a lot of rows as fast as
possible and get back the ids that were added.  The number of rows we
are inserting is dynamic, not of fixed length.

Thanks,
-Jason





Re: [PERFORM] [GENERAL] Strange performance degradation

2009-11-23 Thread Tom Lane
Bill Moran  writes:
> In response to Lorenzo Allegrucci :
>> Tom Lane wrote:
>>> Are you killing off any long-running transactions when you restart?

>> Anyway, how can I get rid those "idle in transaction" processes?
>> Can I just kill -15 them or is there a less drastic way to do it?

> Connections idle in transaction do not cause performance problems simply
> by being there, at least not when there are so few.

The idle transaction doesn't eat resources in itself.  What it does do
is prevent VACUUM from reclaiming dead rows that are recent enough that
they could still be seen by the idle transaction.  The described
behavior sounds to me like other transactions are wasting lots of cycles
scanning through dead-but-not-yet-reclaimed rows.  There are some other
things that also get slower as the window between oldest and newest
active XID gets wider.

(8.4 alleviates this problem in many cases, but the OP said he was
running 8.3.)

> If you -TERM them, any uncommitted data will be rolled back, which may
> not be what you want.  Don't -KILL them, that will upset the postmaster.

-TERM isn't an amazingly safe thing either in 8.3.  Don't you have a way
to kill the client-side sessions?

> My answer to your overarching question is that you need to dig deeper to
> find the real cause of your problem, you're just starting to isolate it.

Agreed, what you really want to do is find and fix the transaction leak
on the client side.

regards, tom lane



Re: [PERFORM] Best possible way to insert and get returned ids

2009-11-23 Thread Scott Marlowe
On Mon, Nov 23, 2009 at 1:53 PM, Jason Dictos  wrote:
> Question:
>
> Is an INSERT command with a RETURNING * clause faster than, say, an
> INSERT followed by a SELECT? Does RETURNING * simply amount to a normal
> SELECT on the added rows? We need to insert a lot of rows as fast as
> possible and get back the ids that were added.  The number of rows we
> are inserting is dynamic, not of fixed length.

Well, if you do an insert, then a select, how can you tell, with that
select, which rows you just inserted?  How can you be sure they're not
somebody else's?

Insert returning is fantastic for this type of thing.  The beauty of
it is that it returns a SET if you insert multiple rows.  And, if
you've got two insert threads running, and one inserts to a sequence a
set of rows with pk values of 10,11,13,15,18,20 while another thread
inserts to the same table and creates a set of rows with pk values of
12,14,16,17,19 then those are the two sets you'll get back with
returning.
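A minimal sketch of the pattern (table and column names invented for illustration):

```sql
-- One round trip: insert several rows and get every generated id back.
CREATE TABLE items (id serial PRIMARY KEY, name text);

INSERT INTO items (name)
VALUES ('a'), ('b'), ('c')
RETURNING id;
-- yields one row per inserted row: the set of new ids
```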



[PERFORM] query optimization

2009-11-23 Thread Faheem Mitha


Hi everybody,

I've got two queries that need optimizing. Actually, there are others, 
but these are pretty representative.


You can see the queries and the corresponding plans at

http://bulldog.duhs.duke.edu/~faheem/snpdb/opt.pdf

or

http://bulldog.duhs.duke.edu/~faheem/snpdb/opt.tex

if you prefer text (latex file, effectively text in this case)

The background to this is at 
http://bulldog.duhs.duke.edu/~faheem/snpdb/diag.pdf


If more details are required, let me know and I can add them. I'd 
appreciate suggestions about how to make these queries go faster.


Please CC this email address on any replies.

   Regards, Faheem.



Re: [PERFORM] query optimization

2009-11-23 Thread Thom Brown
2009/11/23 Faheem Mitha 

>
> Hi everybody,
>
> I've got two queries that need optimizing. Actually, there are others, but
> these are pretty representative.
>
> You can see the queries and the corresponding plans at
>
> http://bulldog.duhs.duke.edu/~faheem/snpdb/opt.pdf
>
> or
>
> http://bulldog.duhs.duke.edu/~faheem/snpdb/opt.tex
>
> if you prefer text (latex file, effectively text in this case)
>
> The background to this is at
> http://bulldog.duhs.duke.edu/~faheem/snpdb/diag.pdf
>
> If more details are required, let me know and I can add them. I'd
> appreciate suggestions about how to make these queries go faster.
>
> Please CC this email address on any replies.
>
>   Regards, Faheem.
>
>
>
Hi Faheem,

There appears to be a discrepancy between the 2 PDFs you provided.  One says
you're using PostgreSQL 8.3, and the other shows you using common table
expressions, which are only available in 8.4+.

Thom


Re: [PERFORM] query optimization

2009-11-23 Thread Faheem Mitha



On Mon, 23 Nov 2009, Thom Brown wrote:


Hi Faheem,

There appears to be a discrepancy between the 2 PDFs you provided.  One 
says you're using PostgreSQL 8.3, and the other shows you using common 
table expressions, which are only available in 8.4+.


Yes, sorry. I'm using Postgresql 8.4. I guess I should go through diag.pdf 
and make sure all the information is current. Thanks for pointing out my 
error.


  Regards, Faheem.


Re: [PERFORM] query optimization

2009-11-23 Thread marcin mank
On Tue, Nov 24, 2009 at 12:49 AM, Faheem Mitha  wrote:
>
> Yes, sorry. I'm using Postgresql 8.4. I guess I should go through diag.pdf
> and make sure all the information is current. Thanks for pointing out my
> error.
>

Excellent report!

About the copy problem: you seem to have created the primary key
before doing the copy (at least that's what the dump before the copy
says). This is bad. Create it after the copy.
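A sketch of the suggested load order (all names are placeholders):

```sql
CREATE TABLE geno_stage (id bigint, snp text, val integer);

-- Load the bulk data first...
COPY geno_stage FROM '/tmp/geno.csv' WITH CSV;

-- ...then build the primary key once over the whole table; maintaining
-- the index row-by-row during COPY is far more expensive.
ALTER TABLE geno_stage ADD PRIMARY KEY (id);
```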

Greetings
Marcin



Re: [PERFORM] query optimization

2009-11-23 Thread Sebastian Jörgensen
How often are the tables you query from updated?

Rgds
Sebastian

On Tue, Nov 24, 2009 at 12:52 AM, marcin mank  wrote:

> On Tue, Nov 24, 2009 at 12:49 AM, Faheem Mitha 
> wrote:
> >
> > Yes, sorry. I'm using Postgresql 8.4. I guess I should go through
> diag.pdf
> > and make sure all the information is current. Thanks for pointing out my
> > error.
> >
>
> Excellent report!
>
> About the copy problem: you seem to have created the primary key
> before doing the copy (at least that's what the dump before the copy
> says). This is bad. Create it after the copy.
>
> Greetings
> Marcin
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] query optimization

2009-11-23 Thread Faheem Mitha



On Tue, 24 Nov 2009, Sebastian Jörgensen wrote:


How often are the tables you query from updated?


Quite rarely. Once in a while. The large tables, eg. geno, are basically 
static.


 Regards, Faheem.


Rgds
Sebastian

On Tue, Nov 24, 2009 at 12:52 AM, marcin mank  wrote:
  On Tue, Nov 24, 2009 at 12:49 AM, Faheem Mitha  
wrote:
  >
  > Yes, sorry. I'm using Postgresql 8.4. I guess I should go through 
diag.pdf
  > and make sure all the information is current. Thanks for pointing out my
  > error.
  >

Excellent report!

About the copy problem: you seem to have created the primary key
before doing the copy (at least that's what the dump before the copy
says). This is bad. Create it after the copy.

Greetings
Marcin








[PERFORM] Query is slow when executing in procedure

2009-11-23 Thread ramasubramanian
Dear all,
The query is slow when executed in a stored procedure (it takes
around 1 minute); when executed as plain SQL it takes 4 seconds.
Basically I am selecting a varchar column which contains 4000 characters.
We have an index on the table, and we have analyzed the table as well.
What could be the reason? How can we improve it?

Thanks in Advance
Ram

Re: [PERFORM] Query is slow when executing in procedure

2009-11-23 Thread Pavel Stehule
2009/11/24 ramasubramanian :
> Dear all,
> The query is slow when executed in a stored procedure (it takes
> around 1 minute); when executed as plain SQL it takes 4 seconds.
> Basically I am selecting a varchar column which contains 4000 characters.
> We have an index on the table, and we have analyzed the table as well.
> What could be the reason? How can we improve it?

Hello

Use a dynamic query - plpgsql uses prepared statements, whose plans are
generated without knowledge of the real parameters, and sometimes that
causes performance problems. The EXECUTE statement (in plpgsql)
generates a new plan for every call, with knowledge of the real
parameters - so it is a solution for you.

http://www.postgresql.org/docs/8.4/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
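A sketch of the two forms in plpgsql (table and function names invented; RETURN QUERY EXECUTE needs 8.4):

```sql
CREATE OR REPLACE FUNCTION find_docs(p_body text)
RETURNS SETOF docs AS $$
BEGIN
  -- Static form: planned once, without knowing p_body:
  --   RETURN QUERY SELECT * FROM docs WHERE body = p_body;

  -- Dynamic form: EXECUTE plans at every call, with the real value:
  RETURN QUERY EXECUTE
    'SELECT * FROM docs WHERE body = ' || quote_literal(p_body);
END;
$$ LANGUAGE plpgsql;
```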

Regards
Pavel Stehule



>
> Thanks in Advance
> Ram



Re: [PERFORM] Query is slow when executing in procedure

2009-11-23 Thread A. Kretschmer
In response to ramasubramanian :
> Dear all,
> The query is slow when executed in a stored procedure (it takes
> around 1 minute); when executed as plain SQL it takes 4 seconds.
> Basically I am selecting a varchar column which contains 4000 characters.
> We have an index on the table, and we have analyzed the table as well.
> What could be the reason? How can we improve it?

The reason is hard to guess, because you don't provide enough
information, such as the function code.

My guess:

You call the function with a parameter, and the planner isn't able to
choose a fast plan because it doesn't know the parameter. That's why it
chooses a seq-scan. You can rewrite your function to dynamically EXECUTE
a string containing your SQL, forcing the planner to search for an
optimal plan for your actual parameter.

But yes, that's only a wild guess (and sorry about my English...)

Please show us the table and the function code.


Regards, Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99



Re: [PERFORM] Query is slow when executing in procedure

2009-11-23 Thread ramasubramanian

Thanks a lot, Pavel. I will try it.

- Original Message - 
From: "Pavel Stehule" 

To: "ramasubramanian" 
Cc: 
Sent: Tuesday, November 24, 2009 11:40 AM
Subject: Re: [PERFORM] Query is slow when executing in procedure


2009/11/24 ramasubramanian :

Dear all,
The query is slow when executed in a stored procedure (it takes
around 1 minute); when executed as plain SQL it takes 4 seconds.
Basically I am selecting a varchar column which contains 4000 characters.
We have an index on the table, and we have analyzed the table as well.
What could be the reason? How can we improve it?


Hello

Use a dynamic query - plpgsql uses prepared statements, whose plans are
generated without knowledge of the real parameters, and sometimes that
causes performance problems. The EXECUTE statement (in plpgsql)
generates a new plan for every call, with knowledge of the real
parameters - so it is a solution for you.

http://www.postgresql.org/docs/8.4/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

Regards
Pavel Stehule





Thanks in Advance
Ram







Re: [PERFORM] Query is slow when executing in procedure

2009-11-23 Thread ramasubramanian

Thanks a lot, Kretschmer. I will try it.

Regards,
Ram

- Original Message - 
From: "A. Kretschmer" 

To: 
Sent: Tuesday, November 24, 2009 11:45 AM
Subject: Re: [PERFORM] Query is slow when executing in procedure



In response to ramasubramanian :

Dear all,
The query is slow when executed in a stored procedure (it takes
around 1 minute); when executed as plain SQL it takes 4 seconds.
Basically I am selecting a varchar column which contains 4000 characters.
We have an index on the table, and we have analyzed the table as well.
What could be the reason? How can we improve it?


The reason is hard to guess, because you don't provide enough
information, such as the function code.

My guess:

You call the function with a parameter, and the planner isn't able to
choose a fast plan because it doesn't know the parameter. That's why it
chooses a seq-scan. You can rewrite your function to dynamically EXECUTE
a string containing your SQL, forcing the planner to search for an
optimal plan for your actual parameter.

But yes, that's only a wild guess (and sorry about my English...)

Please show us the table and the function code.


Regards, Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99





