Here is the explain plan. There query time went backup up to 2 to 3 minutes from 3 second in just 2 hrs. Can anyone suggest something on how to fix this or why this is happening
explain SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id; QUERY PLAN ----------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..24.84 rows=1 width=64) Join Filter: (run.id = work_unit.run_id) -> Nested Loop (cost=0.00..16.55 rows=1 width=16) -> Index Scan using run_status_1 on run (cost=0.00..8.27 rows=1 width=12) Index Cond: (status = 1) -> Index Scan using account_pkey on account (cost=0.00..8.27 rows=1 width=8) Index Cond: (account.id = run.account_id) -> Index Scan using idx_work_unit_1_status_3_new on work_unit (cost=0.00..8.28 rows=1 width=52) Index Cond: (work_unit.status = 3) Table Structure \d work_unit Table "public.work_unit" Column | Type | Modifiers ------------+-----------------------------+----------- id | integer | not null run_id | integer | status | integer | script | character varying | parameters | character varying | input | character varying | start_time | timestamp without time zone | stop_time | timestamp without time zone | priority | integer | type | integer | lineitems | integer | outputs | integer | sub_type | integer | site_code | text | Indexes: "work_unit_pkey" PRIMARY KEY, btree (id) "idx_work_unit_1_partial" btree (run_id, start_time) WHERE status >= 3 "idx_work_unit_1_run_id" btree (run_id) "idx_work_unit_1_script" btree (script) "idx_work_unit_1_site_code" btree (site_code) "idx_work_unit_1_starttime" btree (start_time) "idx_work_unit_1_status_3_new" btree (status, type) WHERE status = 3 AND type <> 1 "idx_work_unit_1_status_5" btree (status) WHERE status >= 4 "idx_work_unit_1_status_part_new" btree (status) WHERE status < 4 "idx_work_unit_1_stop_time" btree (stop_time) \d run Table "public.run" Column | Type | Modifiers ---------------------+--------------------------+--------------- id | integer | not null job_id | integer | start_time | timestamp with time zone | not null status | integer | not null job_name | character varying | account_id | integer | application_id | integer | postproc_script | character varying | upload_url | character varying | complete_time | timestamp with time zone | email | character varying | size | integer | errors | integer | raw_count | integer | munge | integer | job_details | character varying | user_settings | character varying | run_type | integer | stop_after | timestamp with time zone | total_work_time | integer | flags | integer | cluster_info_id | integer | skiplineitems_count | integer | last_update_ts | timestamp with time zone | result_exists | boolean | default false abort_type | integer | Indexes: "run_pkey" PRIMARY KEY, btree (id) "idx_run_acc_stat_comp" btree (account_id, status, complete_time) "idx_run_app" btree (application_id) "idx_run_complete_time_2" btree (complete_time) "idx_run_job_name" btree (job_name) "run_application_account_idx" btree (application_id, account_id, status) "run_job" btree (job_id) "run_result_exists" btree (result_exists) "run_start_time" btree (start_time) "run_status" btree (status) "run_status_1" btree (status) WHERE status = 1 "run_status_part_idx" btree (status) WHERE status < 3 Triggers: run_upd_ts BEFORE INSERT OR UPDATE ON run FOR EACH ROW EXECUTE PROCEDURE set_last_update_ts() tr_proc_update_job_summary AFTER UPDATE ON run FOR EACH ROW EXECUTE PROCEDURE tr_proc_update_job_summary() On Thu, Jul 17, 2014 at 12:00 PM, Potentialtech <wmo...@potentialtech.com> wrote: > On Thu, 17 Jul 2014 11:28:04 -0700 > Prabhjot Sheena <prabhjot.she...@rivalwatch.com> wrote: > > > Thanks i have changed that to 64 and reloaded it. > > > > When i had load issue today there was this exact same query that hits the > > db like 50 to 60 times from different machines in 3 to 4 minutes and was > > taking long time to execute and was holding up the database. i did > recreate > > an index and it started performing better. My question is why it is not > > fetching the result from the memory since its the same query that runs > > again and again. > > > > This is the actual query i m taking about: > > > > SELECT account.id, account.organization_id, run.application_id, > > work_unit.script, work_unit.id, work_unit.start_time, run.id, > > work_unit.priority FROM work_unit, run, account WHERE > work_unit.status = > > 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status > = 1 > > AND run.account_id = account.id > > > > Pls suggest if i can do something to fix this > > Provide a lot more information if you want anyone on the list to be able > to help: such as explain output while the problem is happening, and some > information about the makeup of the tables (column types/indexes/# rows). > > Guessing, based on the little information you've provided, it's likely > that you have something else going on at the same time that you're not > aware of, and this particular query is only a symptom. I'm saying that > because SELECTs don't generally create any WAL traffic, so there were > probably some INSERT/UPDATE/DELETE running at the same time that both > pushed those 3 tables out of memory and/or saturated disk activity to > the point that accessing everything becomes slow for a short while, and > it's just those queries that you noticed. > > Are you making the mistake where you set log_min_duration to 1s and only > worry about queries that take longer than 1s? Because I've seen (on > multiple occasions) where many 1000s of queries, each running less than > 1s, are the actual cause of the problem. pgBadger is particularly helpful > in tracking down situations like that. > > > On Thu, Jul 17, 2014 at 11:06 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > > > > > Potentialtech <wmo...@potentialtech.com> writes: > > > > If the warning isn't happening too often, I would try increasing it > only > > > a > > > > little and see if it helps. If it's not enough you can then > increase it > > > some > > > > more. Various sources around the Internet suggest that you don't > want > > > to go > > > > much larger than 256 for this (if only because it's uncommon to do so > > > and is > > > > probably indicative of other tuning that you need to do). > > > Unfortunatley, you > > > > need to restart PG for the change to take effect, so you have to > balance > > > > experimenting with your tuning against how often you can get away > with a > > > server > > > > restart. > > > > > > Huh? You don't need a restart, just a reload (SIGHUP) to change that. > > > > > > regards, tom lane > > > > > > -- > Potentialtech <wmo...@potentialtech.com> >