Re: [GENERAL] Checkpoint_segments optimal value

Prabhjot Sheena Thu, 17 Jul 2014 16:27:34 -0700

Here is the explain plan. There query time went backup up to 2 to 3 minutes
from 3 second  in just 2 hrs.  Can anyone suggest something on how to fix
this or why this is happening


 explain SELECT account.id, account.organization_id, run.application_id,
work_unit.script, work_unit.id, work_unit.start_time, run.id,
work_unit.priority  FROM work_unit, run, account  WHERE  work_unit.status =
3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1
AND run.account_id = account.id;
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..24.84 rows=1 width=64)
   Join Filter: (run.id = work_unit.run_id)
   ->  Nested Loop  (cost=0.00..16.55 rows=1 width=16)
         ->  Index Scan using run_status_1 on run  (cost=0.00..8.27 rows=1
width=12)
               Index Cond: (status = 1)
         ->  Index Scan using account_pkey on account  (cost=0.00..8.27
rows=1 width=8)
               Index Cond: (account.id = run.account_id)
   ->  Index Scan using idx_work_unit_1_status_3_new on work_unit
(cost=0.00..8.28 rows=1 width=52)
         Index Cond: (work_unit.status = 3)

Table Structure

 \d work_unit
               Table "public.work_unit"
   Column   |            Type             | Modifiers
------------+-----------------------------+-----------
 id         | integer                     | not null
 run_id     | integer                     |
 status     | integer                     |
 script     | character varying           |
 parameters | character varying           |
 input      | character varying           |
 start_time | timestamp without time zone |
 stop_time  | timestamp without time zone |
 priority   | integer                     |
 type       | integer                     |
 lineitems  | integer                     |
 outputs    | integer                     |
 sub_type   | integer                     |
 site_code  | text                        |
Indexes:
    "work_unit_pkey" PRIMARY KEY, btree (id)
    "idx_work_unit_1_partial" btree (run_id, start_time) WHERE status >= 3
    "idx_work_unit_1_run_id" btree (run_id)
    "idx_work_unit_1_script" btree (script)
    "idx_work_unit_1_site_code" btree (site_code)
    "idx_work_unit_1_starttime" btree (start_time)
    "idx_work_unit_1_status_3_new" btree (status, type) WHERE status = 3
AND type <> 1
    "idx_work_unit_1_status_5" btree (status) WHERE status >= 4
    "idx_work_unit_1_status_part_new" btree (status) WHERE status < 4
    "idx_work_unit_1_stop_time" btree (stop_time)


 \d run
                       Table "public.run"
       Column        |           Type           |   Modifiers
---------------------+--------------------------+---------------
 id                  | integer                  | not null
 job_id              | integer                  |
 start_time          | timestamp with time zone | not null
 status              | integer                  | not null
 job_name            | character varying        |
 account_id          | integer                  |
 application_id      | integer                  |
 postproc_script     | character varying        |
 upload_url          | character varying        |
 complete_time       | timestamp with time zone |
 email               | character varying        |
 size                | integer                  |
 errors              | integer                  |
 raw_count           | integer                  |
 munge               | integer                  |
 job_details         | character varying        |
 user_settings       | character varying        |
 run_type            | integer                  |
 stop_after          | timestamp with time zone |
 total_work_time     | integer                  |
 flags               | integer                  |
 cluster_info_id     | integer                  |
 skiplineitems_count | integer                  |
 last_update_ts      | timestamp with time zone |
 result_exists       | boolean                  | default false
 abort_type          | integer                  |
Indexes:
    "run_pkey" PRIMARY KEY, btree (id)
    "idx_run_acc_stat_comp" btree (account_id, status, complete_time)
    "idx_run_app" btree (application_id)
    "idx_run_complete_time_2" btree (complete_time)
    "idx_run_job_name" btree (job_name)
    "run_application_account_idx" btree (application_id, account_id, status)
    "run_job" btree (job_id)
    "run_result_exists" btree (result_exists)
    "run_start_time" btree (start_time)
    "run_status" btree (status)
    "run_status_1" btree (status) WHERE status = 1
    "run_status_part_idx" btree (status) WHERE status < 3
Triggers:
    run_upd_ts BEFORE INSERT OR UPDATE ON run FOR EACH ROW EXECUTE
PROCEDURE set_last_update_ts()
    tr_proc_update_job_summary AFTER UPDATE ON run FOR EACH ROW EXECUTE
PROCEDURE tr_proc_update_job_summary()




On Thu, Jul 17, 2014 at 12:00 PM, Potentialtech <wmo...@potentialtech.com>
wrote:

> On Thu, 17 Jul 2014 11:28:04 -0700
> Prabhjot Sheena <prabhjot.she...@rivalwatch.com> wrote:
>
> > Thanks i have changed that to 64 and reloaded it.
> >
> > When i had load issue today there was this exact same query that hits the
> > db like 50 to 60 times from different machines in 3 to 4 minutes and was
> > taking long time to execute and was holding up the database. i did
> recreate
> > an index and it started performing better. My question is why it is not
> > fetching the result from the memory since  its the same query that runs
> > again and again.
> >
> > This is the actual query i m taking about:
> >
> > SELECT account.id, account.organization_id, run.application_id,
> > work_unit.script, work_unit.id, work_unit.start_time, run.id,
> > work_unit.priority  FROM work_unit, run, account  WHERE
>  work_unit.status =
> > 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status
> = 1
> > AND run.account_id = account.id
> >
> > Pls suggest if i can do something to fix this
>
> Provide a lot more information if you want anyone on the list to be able
> to help: such as explain output while the problem is happening, and some
> information about the makeup of the tables (column types/indexes/# rows).
>
> Guessing, based on the little information you've provided, it's likely
> that you have something else going on at the same time that you're not
> aware of, and this particular query is only a symptom.  I'm saying that
> because SELECTs don't generally create any WAL traffic, so there were
> probably some INSERT/UPDATE/DELETE running at the same time that both
> pushed those 3 tables out of memory and/or saturated disk activity to
> the point that accessing everything becomes slow for a short while, and
> it's just those queries that you noticed.
>
> Are you making the mistake where you set log_min_duration to 1s and only
> worry about queries that take longer than 1s?  Because I've seen (on
> multiple occasions) where many 1000s of queries, each running less than
> 1s, are the actual cause of the problem.  pgBadger is particularly helpful
> in tracking down situations like that.
>
> > On Thu, Jul 17, 2014 at 11:06 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >
> > > Potentialtech <wmo...@potentialtech.com> writes:
> > > > If the warning isn't happening too often, I would try increasing it
> only
> > > a
> > > > little and see if it helps.  If it's not enough you can then
> increase it
> > > some
> > > > more.  Various sources around the Internet suggest that you don't
> want
> > > to go
> > > > much larger than 256 for this (if only because it's uncommon to do so
> > > and is
> > > > probably indicative of other tuning that you need to do).
> > >  Unfortunatley, you
> > > > need to restart PG for the change to take effect, so you have to
> balance
> > > > experimenting with your tuning against how often you can get away
> with a
> > > server
> > > > restart.
> > >
> > > Huh?  You don't need a restart, just a reload (SIGHUP) to change that.
> > >
> > >                         regards, tom lane
> > >
>
>
> --
> Potentialtech <wmo...@potentialtech.com>
>

Re: [GENERAL] Checkpoint_segments optimal value

Reply via email to