Re: [PERFORM] Why is the query not using the index for sorting?
On Sun, 22 Nov 2009, Jonathan Blitz wrote:
> I have a table with a number of columns. I perform
>
>     Select * from table order by a, b
>
> There is an index on (a, b) which is clustered (as well as indexes on a and
> b alone). I have issued the CLUSTER and ANALYZE commands.

Did you analyse *after* creating the index and clustering, or before?

Matthew

--
[About NP-completeness] These are the problems that make efficient use of
the Fairy Godmother.                       -- Computer Science Lecturer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Query times change by orders of magnitude as DB ages
On Sun, 22 Nov 2009, Richard Neill wrote:
> Worse still, doing a cluster of most of the tables and vacuum full analyze

Why are you doing a VACUUM FULL? That command is not meant to be used except
in the most unusual of circumstances, as it causes index bloat. If you have
run a CLUSTER command, then running VACUUM FULL will make the table and index
layout worse, not better.

Matthew

--
Riker: Our memory pathways have become accustomed to your sensory input.
Data:  I understand - I'm fond of you too, Commander. And you too, Counsellor.
Re: [PERFORM] Why is the query not using the index for sorting?
Definitely after.

Jonathan

-----Original Message-----
From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Matthew Wakeling
Sent: Monday, November 23, 2009 1:00 PM
To: Jonathan Blitz
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Why is the query not using the index for sorting?

On Sun, 22 Nov 2009, Jonathan Blitz wrote:
> I have a table with a number of columns.
>
> I perform
>
> Select *
> from table
> order by a, b
>
> There is an index on (a, b) which is clustered (as well as indexes on a and
> b alone). I have issued the CLUSTER and ANALYZE commands.

Did you analyse *after* creating the index and clustering, or before?

Matthew
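For readers following along, the ordering under discussion can be sketched as follows (the table and index names are illustrative, not from the thread):

```sql
-- Hypothetical table t with columns a and b.
CREATE INDEX t_a_b_idx ON t (a, b);

-- Rewrite the table in index order (CLUSTER ... USING is 8.3+ syntax).
CLUSTER t USING t_a_b_idx;

-- Collect statistics *after* clustering, so the planner's correlation
-- estimate for (a, b) reflects the new physical row order.
ANALYZE t;
```

If ANALYZE runs before CLUSTER, the planner may still believe the heap is poorly correlated with the index and prefer an explicit sort over an index scan.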
Re: [PERFORM] View based upon function won't use index on joins
2009/11/20 Jonathan Foy :
> Shy of just throwing a trigger in the table to actually populate a second
> table with the same data solely for reporting purposes,

That's what I would do in your situation, FWIW. Query optimization is a hard
problem even under the best of circumstances; getting the planner to DTRT
with a crazy schema is - well, really hard.

...Robert
Re: [PERFORM] Query times change by orders of magnitude as DB ages
On Sun, Nov 22, 2009 at 10:31 AM, Richard Neill wrote:
> Dear All,
>
> Thanks for your help earlier with the previous question. I wonder if I
> might ask another.
>
> We have various queries that need to run, of which I'm going to focus on
> two, "vox" and "du_report".
>
> Both of them are extremely sensitive to the precise values of
> random_page_cost and seq_page_cost. Experimentally, I've used:
>
>   A: seq_page_cost = 0.25; random_page_cost = 0.75
>   B: seq_page_cost = 0.5;  random_page_cost = 2
>   C: seq_page_cost = 1;    random_page_cost = 4
>
> (and a few in between).
>
> If I pick the wrong one, then either vox becomes two orders of magnitude
> slower (22 ms -> 3.5 seconds), or du_report becomes 10x slower. I can't
> use the same setting for both.
>
> So, as a very ugly hack, I've tuned the sweet spots for each query.
> Vox normally sits at B; du_report at C.
>
> Now, the real killer is that the position of that sweet spot changes over
> time as the DB ages over a few days (even though autovacuum is on).
>
> Worse still, doing a cluster of most of the tables and vacuum full analyze
> made most of the queries respond much better, but the vox query became
> very slow again, until I set it to A (which, a few days ago, did not work
> well).
>
> * Why is the query planner so precisely sensitive to the combination of
> page costs and time since last vacuum full?

It sounds like your tables are getting bloated. If you have autovacuum
turned on, this shouldn't be happening. What sort of workload is this?
What PG version?

> * Why is it that what improves one query can make another get so much
> worse?

Because it changes the plan you get.

> * Is there any way I can nail the query planner to a particular query
> plan, rather than have it keep changing its mind?

See other responses.

> * Is it normal to keep having to tune the query-planner's settings, or
> should it be possible to set it once, and leave it?

Leave it.
...Robert
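One way to keep two mutually incompatible sweet spots without a global compromise (a sketch; the cost values and the query name are illustrative) is to scope the cost settings to a single transaction:

```sql
BEGIN;
-- SET LOCAL only lasts until COMMIT/ROLLBACK, so every other query
-- keeps the server-wide defaults from postgresql.conf.
SET LOCAL seq_page_cost = 0.5;
SET LOCAL random_page_cost = 2;
SELECT * FROM vox_summary;  -- stand-in for the real "vox" query
COMMIT;
```

This does not remove the underlying sensitivity, but it stops one query's tuning from degrading the other.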
Re: [PERFORM] Performance degrade running on multicore computer
afancy wrote:
> Hi,
>
> My PostgreSQL server has two CPUs (OS: Fedora 11), each with 4 cores,
> for 8 cores in total. I have several clients running at the same time
> doing inserts and updates on the same table, each client with its own
> connection. I ran two tests, with clients loading 20M rows in total in
> parallel. In each test the data is split evenly by client count, so each
> client loads only its own piece of the data.
>
> 1) Long transaction: each client commits once, at the end of loading.
> Result: each postgres backend consumes 95% CPU. The more clients run in
> parallel, the slower the total running time (with 8 clients it is
> slowest). I expected the opposite: the more clients run in parallel,
> the faster the load should finish.
>
> 2) Short transactions: each client commits every 500 records.
> Result: each postgres backend consumes about 50% CPU. Now the total
> running time is what I expected: the more clients run in parallel, the
> faster it is (with 8 clients it is fastest).
>
> Could anybody help explain why the long-transaction case with 8 clients
> is slowest? How can I solve this problem? I would rather not use option
> 2), where I have to pick a commit batch size.
>
> Thanks a lot!!
>
> -Afancy

Since you have 2 CPUs, you may want to try setting the processor affinity
for postgres (server and client programs) to the 4 cores on one of the CPUs
(taskset command on Linux). Here's an excerpt from a modified
/etc/init.d/postgresql:

  $SU -l postgres -c "taskset -c 4-7 $PGENGINE/postmaster -p '$PGPORT' -D '$PGDATA' ${PGOPTS} &" >> "$PGLOG" 2>&1 < /dev/null

Thanks to Greg Smith for pointing this out when we had a similar issue with
a 2-CPU server.

NB: This was with PostgreSQL 8.3. I don't know whether 8.4+ has built-in
processor affinity. (Apologies in advance for the email formatting.)
Re: [PERFORM] [GENERAL] Strange performance degradation
Tom Lane wrote:
> Lorenzo Allegrucci writes:
>> So, my main question is.. how can just a plain simple restart of postgres
>> restore the original performance (3% cpu time)?
>
> Are you killing off any long-running transactions when you restart?

After three days of patient waiting it looks like the common
"idle in transaction" problem.. [sorry for >80 cols]

19329 ?  S     15:54  /usr/lib/postgresql/8.3/bin/postgres -D /var/lib/postgresql/8.3/main -c config_file=/etc/postgresql/8.3/main/postgresql.conf
19331 ?  Ss     3:40  \_ postgres: writer process
19332 ?  Ss     0:42  \_ postgres: wal writer process
19333 ?  Ss    15:01  \_ postgres: stats collector process
19586 ?  Ss   114:00  \_ postgres: forinicom weadmin [local] idle
20058 ?  Ss     0:00  \_ postgres: forinicom weadmin [local] idle
13136 ?  Ss     0:00  \_ postgres: forinicom weadmin 192.168.4.253(43721) idle in transaction

My app is a Django webapp; maybe there's some bug in the Django+psycopg2
stack? Anyway, how can I get rid of those "idle in transaction" processes?
Can I just kill -15 them, or is there a less drastic way to do it?
[PERFORM] Best possible way to insert and get returned ids
Question:

Is an INSERT command with RETURNING * faster than an INSERT followed by a
separate SELECT? Does RETURNING * simply amount to a normal SELECT command
on the added rows? We basically need to insert a lot of rows as fast as
possible and get back the ids that were added. The number of rows we are
inserting is dynamic, not of fixed length.

Thanks,

-Jason
Re: [PERFORM] [GENERAL] Strange performance degradation
Bill Moran writes:
> In response to Lorenzo Allegrucci :
>> Tom Lane wrote:
>>> Are you killing off any long-running transactions when you restart?
>> Anyway, how can I get rid of those "idle in transaction" processes?
>> Can I just kill -15 them or is there a less drastic way to do it?
> Connections idle in transaction do not cause performance problems simply
> by being there, at least not when there are so few.

The idle transaction doesn't eat resources in itself. What it does do is
prevent VACUUM from reclaiming dead rows that are recent enough that they
could still be seen by the idle transaction. The described behavior sounds
to me like other transactions are wasting lots of cycles scanning through
dead-but-not-yet-reclaimed rows. There are some other things that also get
slower as the window between oldest and newest active XID gets wider.
(8.4 alleviates this problem in many cases, but the OP said he was
running 8.3.)

> If you -TERM them, any uncommitted data will be rolled back, which may
> not be what you want. Don't -KILL them, that will upset the postmaster.

-TERM isn't an amazingly safe thing either in 8.3. Don't you have a way to
kill the client-side sessions?

> My answer to your overarching question is that you need to dig deeper to
> find the real cause of your problem, you're just starting to isolate it.

Agreed, what you really want to do is find and fix the transaction leak on
the client side.

regards, tom lane
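To see which backends are holding transactions open before deciding how to end them, a query along these lines works against the 8.3/8.4 catalog (note the column names changed in later releases, where state/query replace current_query):

```sql
-- In 8.3/8.4, an open-but-idle transaction shows up in
-- pg_stat_activity with current_query = '<IDLE> in transaction'.
-- xact_start shows how long the transaction has been held open.
SELECT procpid, usename, client_addr, xact_start
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY xact_start;
```

The oldest xact_start here is the transaction pinning VACUUM's reclaim horizon.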
Re: [PERFORM] Best possible way to insert and get returned ids
On Mon, Nov 23, 2009 at 1:53 PM, Jason Dictos wrote:
> Is an INSERT command with RETURNING * faster than an INSERT and then a
> SELECT? Does RETURNING * simply amount to a normal SELECT command on the
> added rows? We need to basically insert a lot of rows as fast as possible,
> and get the ids that were added. The number of rows we are inserting is
> dynamic and is not of fixed length.

Well, if you do an insert, then a select, how can you tell, with that
select, which rows you just inserted? How can you be sure they're not
somebody else's? INSERT ... RETURNING is fantastic for this type of thing.
The beauty of it is that it returns a SET if you insert multiple rows.
And if you've got two insert threads running, and one inserts to a sequence
a set of rows with pk values of 10, 11, 13, 15, 18, 20 while another thread
inserts to the same table and creates a set of rows with pk values of
12, 14, 16, 17, 19, then those are the two sets you'll get back with
RETURNING.
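A minimal sketch of the multi-row case (table and column names are illustrative):

```sql
-- id is assumed to be a serial/sequence-backed primary key.
CREATE TABLE items (id serial PRIMARY KEY, name text);

-- One statement inserts all the rows and hands back exactly the ids
-- this statement created, even with concurrent inserters - no second
-- round trip and no guessing which rows were yours.
INSERT INTO items (name)
VALUES ('alpha'), ('beta'), ('gamma')
RETURNING id;
```

The RETURNING set can also be consumed row by row from a client driver, just like an ordinary SELECT result.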
[PERFORM] query optimization
Hi everybody,

I've got two queries that need optimizing. Actually, there are others, but
these are pretty representative.

You can see the queries and the corresponding plans at

http://bulldog.duhs.duke.edu/~faheem/snpdb/opt.pdf

or

http://bulldog.duhs.duke.edu/~faheem/snpdb/opt.tex

if you prefer text (a LaTeX file, effectively text in this case).

The background to this is at
http://bulldog.duhs.duke.edu/~faheem/snpdb/diag.pdf

If more details are required, let me know and I can add them. I'd
appreciate suggestions about how to make these queries go faster.

Please CC this email address on any replies.

Regards, Faheem.
Re: [PERFORM] query optimization
2009/11/23 Faheem Mitha
> I've got two queries that need optimizing. Actually, there are others,
> but these are pretty representative.
>
> You can see the queries and the corresponding plans at
> http://bulldog.duhs.duke.edu/~faheem/snpdb/opt.pdf or
> http://bulldog.duhs.duke.edu/~faheem/snpdb/opt.tex
> if you prefer text (a LaTeX file, effectively text in this case).
>
> The background to this is at
> http://bulldog.duhs.duke.edu/~faheem/snpdb/diag.pdf
>
> If more details are required, let me know and I can add them. I'd
> appreciate suggestions about how to make these queries go faster.
>
> Please CC this email address on any replies.
>
> Regards, Faheem.

Hi Faheem,

There appears to be a discrepancy between the two PDFs you provided. One
says you're using PostgreSQL 8.3, and the other shows you using common
table expressions, which are only available in 8.4+.

Thom
Re: [PERFORM] query optimization
On Mon, 23 Nov 2009, Thom Brown wrote:
> Hi Faheem,
>
> There appears to be a discrepancy between the two PDFs you provided. One
> says you're using PostgreSQL 8.3, and the other shows you using common
> table expressions, which are only available in 8.4+.

Yes, sorry. I'm using PostgreSQL 8.4. I guess I should go through diag.pdf
and make sure all the information is current. Thanks for pointing out my
error.

Regards, Faheem.
Re: [PERFORM] query optimization
On Tue, Nov 24, 2009 at 12:49 AM, Faheem Mitha wrote:
> Yes, sorry. I'm using PostgreSQL 8.4. I guess I should go through
> diag.pdf and make sure all the information is current. Thanks for
> pointing out my error.

Excellent report!

About the COPY problem: you seem to have created the primary key before
doing the COPY (at least that's what the dump before the COPY says). This
is bad. Create it after the COPY.

Greetings
Marcin
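The suggested ordering can be sketched as follows (the column list and the file path are illustrative, not taken from Faheem's schema):

```sql
-- Load the bulk data into an index-less table first...
CREATE TABLE geno (id bigint, snp text, value integer);
COPY geno FROM '/path/to/geno.csv' WITH CSV;

-- ...then build the primary key in one pass over the finished heap,
-- which is much cheaper than maintaining the index row by row
-- during the load.
ALTER TABLE geno ADD PRIMARY KEY (id);
```

The same reasoning applies to any secondary indexes: drop or defer them for the load, then create them afterwards.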
Re: [PERFORM] query optimization
How often are the tables you query from updated?

Rgds
Sebastian

On Tue, Nov 24, 2009 at 12:52 AM, marcin mank wrote:
> Excellent report!
>
> About the COPY problem: you seem to have created the primary key before
> doing the COPY (at least that's what the dump before the COPY says). This
> is bad. Create it after the COPY.
>
> Greetings
> Marcin
Re: [PERFORM] query optimization
On Tue, 24 Nov 2009, Sebastian Jörgensen wrote:
> How often are the tables you query from updated?

Quite rarely; once in a while. The large tables, e.g. geno, are basically
static.

Regards, Faheem.
[PERFORM] Query is slow when executing in procedure
Dear all,

The query is slow when executed from the stored procedure (it takes around
1 minute); when executed as plain SQL it takes 4 seconds. Basically I am
selecting a varchar column which contains 4000 characters. We have an
index on the table, and we have analyzed the table as well. What could be
the reason? How can I improve it?

Thanks in advance
Ram
Re: [PERFORM] Query is slow when executing in procedure
2009/11/24 ramasubramanian :
> Dear all,
> The query is slow when executed from the stored procedure (it takes
> around 1 minute); when executed as plain SQL it takes 4 seconds.
> Basically I am selecting a varchar column which contains 4000 characters.
> We have an index on the table, and we have analyzed the table as well.
> What could be the reason? How can I improve it?

Hello

Use a dynamic query - plpgsql uses prepared statements, whose plans are
generated without knowledge of the real parameter values. Sometimes this
causes performance problems. An EXECUTE statement (in plpgsql) builds a new
plan for every call, generated with knowledge of the real parameter values
- so it is a solution for you.

http://www.postgresql.org/docs/8.4/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

Regards
Pavel Stehule

> Thanks in Advance
> Ram
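A minimal sketch of the difference (the function, table, and column names are hypothetical; RETURN QUERY EXECUTE ... USING is available from 8.4):

```sql
CREATE OR REPLACE FUNCTION find_docs(p_text varchar)
RETURNS SETOF docs AS $$
BEGIN
  -- A plain "RETURN QUERY SELECT ..." here would run from a generic
  -- prepared plan. EXECUTE re-plans on every call using the actual
  -- parameter value, so the planner can pick the index when p_text
  -- is selective.
  RETURN QUERY EXECUTE
    'SELECT * FROM docs WHERE body = $1'
    USING p_text;
END;
$$ LANGUAGE plpgsql;
```

The trade-off is a re-plan on every call, which is usually cheap next to a 4-second-vs-1-minute plan difference.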
Re: [PERFORM] Query is slow when executing in procedure
In response to ramasubramanian :
> Dear all,
> The query is slow when executed from the stored procedure (it takes
> around 1 minute); when executed as plain SQL it takes 4 seconds.
> Basically I am selecting a varchar column which contains 4000 characters.
> We have an index on the table, and we have analyzed the table as well.
> What could be the reason? How can I improve it?

The reason is hard to guess, because you don't provide enough information,
such as the function code.

My guess: you call the function with a parameter, and the planner isn't
able to choose a fast plan because it doesn't know the parameter value.
That's why it chooses a seq-scan. You can rewrite your function to
dynamically EXECUTE a string containing your SQL, to force the planner to
search for an optimal plan for your actual parameter.

But yes, that's only a wild guess (and sorry about my English...). Please
show us the table and the function code.

Regards, Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99
Re: [PERFORM] Query is slow when executing in procedure
Thanks a lot, Pavel. I will try it.

----- Original Message -----
From: "Pavel Stehule"
To: "ramasubramanian"
Sent: Tuesday, November 24, 2009 11:40 AM
Subject: Re: [PERFORM] Query is slow when executing in procedure

> Use a dynamic query - plpgsql uses prepared statements, with plans
> generated without knowledge of the real parameters. [...]
> An EXECUTE statement (in plpgsql) uses a new plan for every call - so it
> is a solution for you.
Re: [PERFORM] Query is slow when executing in procedure
Thanks a lot, Kretschmer. I will try it.

Regards,
Ram

----- Original Message -----
From: "A. Kretschmer"
Sent: Tuesday, November 24, 2009 11:45 AM
Subject: Re: [PERFORM] Query is slow when executing in procedure

> My guess: you call the function with a parameter, and the planner isn't
> able to choose a fast plan because it doesn't know the parameter. [...]
> Please show us the table and the function code.