[PERFORM] cpu throttling
We have a complicated stored procedure that we run frequently. It pegs one of our postmaster processes at 100% CPU utilization for a few hours. This has the unfortunate side effect of causing increased latency for our other queries.

We are currently planning a fix, but because of the complicated nature of this procedure it is going to take some time to implement. I've noticed that if I renice the process that is running the query, the other postmaster processes are able to respond to our other queries in a timely fashion.

My question: Is there a way I can decrease the priority of a specific query, or determine the PID of the process it is running in? I'd like to throw together a quick shell script if at all possible, as right now I have to monitor the process manually and we'll have fixed the problem long before we have the chance to implement proper database clustering.

Bryan

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] cpu throttling
On Thursday 02 August 2007 09:02, "Bryan Murphy" <[EMAIL PROTECTED]> wrote:
> My question: Is there a way I can decrease the priority of a specific
> query, or determine the PID of the process it is running in? I'd like
> to throw together a quick shell script if at all possible, as right
> now I have to monitor the process manually and we'll have fixed the
> problem long before we have the chance to implement proper database
> clustering.

select procpid from pg_stat_activity
 where current_query like '%stored_proc%'
   and current_query not like '%pg_stat_activity%';

This requires stats_command_string to be enabled.

I'm surprised your operating system doesn't automatically lower the priority of the process, though ..

--
"Remember when computers were frustrating because they did exactly what
you told them to? That actually seems sort of quaint now." --J.D. Baldwin
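The quick shell script Bryan asked for could be built on that query. A minimal sketch, assuming the procedure text matches 'stored_proc', that stats_command_string is on, and that the invoking user is allowed to renice the backend processes:

```shell
#!/bin/sh
# Sketch only: find backends running the expensive procedure and renice them.
# "stored_proc" is a stand-in for the real procedure name.

# Build the renice command for one PID without executing it, so the action
# can be logged or dry-run first. Priority 10 is an arbitrary choice.
build_renice() {
    echo "renice 10 -p $1"
}

# Fetch PIDs of backends whose current query matches $1, excluding this
# monitoring query itself.
find_backend_pids() {
    psql -At -c "SELECT procpid FROM pg_stat_activity
                 WHERE current_query LIKE '%$1%'
                   AND current_query NOT LIKE '%pg_stat_activity%';"
}

# Only act when explicitly asked, e.g.:  ./renice_proc.sh --apply stored_proc
if [ "${1:-}" = "--apply" ]; then
    for pid in $(find_backend_pids "${2:-stored_proc}"); do
        $(build_renice "$pid")
    done
fi
```

Run it from cron every minute or so and the manual monitoring goes away until the real fix lands.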
Re: [PERFORM] cpu throttling
It's a 4-processor Intel Xeon machine with more than enough RAM. The entire database can fit in memory, and while the CPU is pegged, nothing is chewing up I/O bandwidth and nothing is getting swapped out of RAM. I'm running Debian stable with only a few tweaks to the kernel's memory settings. As far as I'm aware, I have not changed anything that would impact scheduling.

Other queries do respond, but it's more like every couple of seconds one query which normally takes 300ms might take 8000ms. Nothing terrible, but enough that our users will notice.

Bryan

On 8/2/07, Alan Hodgson <[EMAIL PROTECTED]> wrote:
> select procpid from pg_stat_activity
>  where current_query like '%stored_proc%'
>    and current_query not like '%pg_stat_activity%';
>
> requires stats_command_string to be enabled
>
> I'm surprised your operating system doesn't automatically lower the
> priority of the process, though ..
[PERFORM] Why are distinct and group by choosing different plans?
I notice that I get different plans when I run the following two queries that I thought would be identical:

select distinct test_col from mytable;
select test_col from mytable group by test_col;

Any reason why it favors one in one case but not the other?

d=# explain analyze select distinct test_col from mytable;
                                  QUERY PLAN
 Unique  (cost=0.00..14927.69 rows=27731 width=4) (actual time=0.144..915.214 rows=208701 loops=1)
   ->  Index Scan using "mytable(test_col)" on mytable  (cost=0.00..14160.38 rows=306925 width=4) (actual time=0.140..575.580 rows=306925 loops=1)
 Total runtime: 1013.657 ms
(3 rows)

d=# explain analyze select test_col from mytable group by test_col;
                                  QUERY PLAN
 HashAggregate  (cost=7241.56..7518.87 rows=27731 width=4) (actual time=609.058..745.295 rows=208701 loops=1)
   ->  Seq Scan on mytable  (cost=0.00..6474.25 rows=306925 width=4) (actual time=0.063..280.000 rows=306925 loops=1)
 Total runtime: 840.321 ms
(3 rows)

d=# select version();
                                  version
 PostgreSQL 8.2.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
(1 row)
Re: [PERFORM] Why are distinct and group by choosing different plans?
"Ron Mayer" <[EMAIL PROTECTED]> writes:
> I notice that I get different plans when I run the
> following two queries that I thought would be identical.
>
> select distinct test_col from mytable;
> select test_col from mytable group by test_col;
>
> Any reason why it favors one in one case but not the other?

I think "distinct" just doesn't know about hash aggregates yet. That's partly an oversight and partly a "feature", in that it gives a convenient way to write a query which avoids them. I think it's also partly that "distinct" is trickier to fix because it shares a codepath with "distinct on", which is decidedly more complex than a simple "distinct".

> d=# explain analyze select distinct test_col from mytable;
>
>  Unique  (cost=0.00..14927.69 rows=27731 width=4) (actual time=0.144..915.214 rows=208701 loops=1)
>    ->  Index Scan using "mytable(test_col)" on mytable  (cost=0.00..14160.38 rows=306925 width=4) (actual time=0.140..575.580 rows=306925 loops=1)
>  Total runtime: 1013.657 ms
> (3 rows)

I assume you have random_page_cost dialled way down? The costs seem too low for the default random_page_cost. This query would usually generate a sort rather than an index scan.

--
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Re: [PERFORM] Why are distinct and group by choosing different plans?
Gregory Stark <[EMAIL PROTECTED]> writes:
> I think "distinct" just doesn't know about hash aggregates yet. That's partly
> an oversight and partly of a "feature" in that it gives a convenient way to
> write a query which avoids them. I think it's also partly that "distinct" is
> trickier to fix because it's the same codepath as "distinct on" which is
> decidedly more complex than a simple "distinct".

It's not an oversight :-(. But the DISTINCT/DISTINCT ON code is old, crufty, and tightly entwined with ORDER BY processing. It'd be nice to clean it all up someday, but the effort seems a bit out of proportion to the reward...

regards, tom lane
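Until that cleanup happens, the workaround implied by the thread is purely mechanical: spell the DISTINCT as a GROUP BY so the 8.2 planner is free to use a HashAggregate. A tiny sketch of the rewrite (the helper name is made up for illustration, not part of any tool):

```shell
#!/bin/sh
# distinct_to_group_by rewrites "SELECT DISTINCT col FROM tbl" into the
# GROUP BY form, which the planner can execute as a HashAggregate.
distinct_to_group_by() {
    tbl=$1
    col=$2
    echo "SELECT $col FROM $tbl GROUP BY $col;"
}

# Example (table and column from the thread):
#   distinct_to_group_by mytable test_col | psql d
```

The reverse also holds: when the sorted output of a Unique-over-index-scan plan is what you want, plain DISTINCT is the way to ask for it.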
RES: RES: [PERFORM] Improving select peformance
Hi,

In this case, I believe the best choice to improve the performance of this particular SQL statement is adding 'set join_collapse_limit = 1;' just before the join statement, correct?

Is there anything else we could do to, in this case, make the planner choose better paths using the default join_collapse_limit?

Thank you in advance!

Reimer

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] on behalf of Carlos H. Reimer
> Sent: Wednesday, 1 August 2007 21:26
> To: Alvaro Herrera
> Cc: Tom Lane; pgsql-performance@postgresql.org
> Subject: RES: RES: [PERFORM] Improving select peformance
>
> Yes, but as the change did not alter the response time I used the original
> view.
>
> Anyway, here are the response times using the changed view (without the
> concatenation conditions), with join_collapse_limit set to 8:
>
> Nested Loop Left Join  (cost=963.68..76116.63 rows=1 width=194) (actual time=8219.028..1316669.201 rows=256 loops=1)
>   ->  Nested Loop  (cost=963.68..76116.23 rows=1 width=198) (actual time=8196.502..1316638.186 rows=256 loops=1)
>         Join Filter: ((gra.filmat = ive.filmat) AND (gra.codmat = ive.codmat) AND (gra.codcor = ive.codcor) AND (gra.codtam = ive.codtam))
>         ->  Nested Loop  (cost=1.11..3370.95 rows=1 width=150) (actual time=33.058..255.428 rows=414 loops=1)
>               Join Filter: (div.coddiv = ddiv.codtab)
>               ->  Nested Loop  (cost=1.11..3369.89 rows=1 width=159) (actual time=33.043..249.609 rows=414 loops=1)
>                     Join Filter: (sub.codsub = dsub.codtab)
>                     ->  Nested Loop  (cost=1.11..3368.82 rows=1 width=168) (actual time=33.026..243.603 rows=414 loops=1)
>                           Join Filter: ((gra.codcor)::text = (div.codite)::text)
>                           ->  Hash Join  (cost=1.11..3356.11 rows=9 width=145) (actual time=33.004..222.375 rows=414 loops=1)
>                                 Hash Cond: ((gra.codtam)::text = (sub.codite)::text)
>                                 ->  Nested Loop  (cost=0.00..3352.55 rows=377 width=122) (actual time=32.810..219.046 rows=414 loops=1)
>                                       ->  Index Scan using i_fk_pro_ddep on tt_pro pro  (cost=0.00..123.83 rows=437 width=76) (actual time=25.199..118.851 rows=414 loops=1)
>                                             Index Cond: (1::numeric = depart)
>                                       ->  Index Scan using pk_gra on tt_gra gra  (cost=0.00..7.37 rows=1 width=46) (actual time=0.225..0.231 rows=1 loops=414)
>                                             Index Cond: ((pro.filmat = gra.filmat) AND (pro.codmat = gra.codmat))
>                                 ->  Hash  (cost=1.05..1.05 rows=5 width=32) (actual time=0.039..0.039 rows=5 loops=1)
>                                       ->  Seq Scan on tt_sub sub  (cost=0.00..1.05 rows=5 width=32) (actual time=0.009..0.015 rows=5 loops=1)
>                           ->  Seq Scan on tt_div div  (cost=0.00..1.15 rows=15 width=32) (actual time=0.003..0.015 rows=15 loops=414)
>                     ->  Seq Scan on td_sub dsub  (cost=0.00..1.03 rows=3 width=9) (actual time=0.002..0.005 rows=3 loops=414)
>               ->  Seq Scan on td_div ddiv  (cost=0.00..1.03 rows=3 width=9) (actual time=0.002..0.005 rows=3 loops=414)
>         ->  Hash Join  (cost=962.57..72738.01 rows=363 width=114) (actual time=0.588..3178.606 rows=857 loops=414)
>               Hash Cond: (ive.sequen = ven.sequen)
>               ->  Nested Loop  (cost=0.00..69305.21 rows=657761 width=85) (actual time=0.041..2623.627 rows=656152 loops=414)
>                     ->  Seq Scan on td_nat nat  (cost=0.00..1.24 rows=1 width=9) (actual time=0.004..0.012 rows=1 loops=414)
>                           Filter: (-3::numeric = codtab)
>                     ->  Seq Scan on tt_ive ive  (cost=0.00..62726.36 rows=657761 width=76) (actual time=0.034..1685.506 rows=656152 loops=414)
>                           Filter: ((sitmov <> 'C'::bpchar) AND ('001'::bpchar = codfil))
>               ->  Hash  (cost=960.39..960.39 rows=174 width=89) (actual time=41.542..41.542 rows=394 loops=1)
>                     ->  Hash Left Join  (cost=3.48..960.39 rows=174 width=89) (actual time=16.936..40.693 rows=394 loops=1)
>                           Hash Cond: ((ven.filcli = cfg.vc_filcli) AND (ven.codcli = cfg.vc_codcli))
>                           ->  Hash Join  (cost=2.45..958.05 rows=174 width=106) (actual time=16.895..39.747 rows=394 loops=1)
>                                 Hash Cond: ((ven.filpgt = pla.filpgt) AND (ven.codpgt = pla.codpgt))
>                                 ->  Index Scan using i_l
Re: RES: RES: [PERFORM] Improving select peformance
"Carlos H. Reimer" <[EMAIL PROTECTED]> writes:
> In this case, I believe the best choice to improve the performance of this
> particular SQL statement is adding the 'set join_collapse_limit = 1;' just
> before the join statement, correct?

That's a mighty blunt instrument. The real problem with your query is the misestimation of the join sizes --- are you sure the table statistics are up to date? Maybe you'd get better estimates with more statistics (ie, increase the stats target for these tables).

regards, tom lane
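Tom's suggestion can be sketched in a couple of statements. The table and column names below are taken from the posted plan (tt_ive's filter is where the big misestimate shows up), and the target value 200 is only an illustrative guess, not a recommendation (the 8.x default is 10):

```shell
#!/bin/sh
# Sketch: emit the SQL to raise a column's statistics target and re-ANALYZE,
# so it can be reviewed before being piped into psql. build_stats_sql is a
# made-up helper name, not part of any tool.
build_stats_sql() {
    tbl=$1
    col=$2
    target=$3
    printf 'ALTER TABLE %s ALTER COLUMN %s SET STATISTICS %s;\nANALYZE %s;\n' \
        "$tbl" "$col" "$target" "$tbl"
}

# Review, then apply with:
#   build_stats_sql tt_ive codfil 200 | psql
```

After the ANALYZE, re-run EXPLAIN ANALYZE with the default join_collapse_limit and compare the estimated row counts against the actual ones before reaching for the blunter instrument.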