Re: [PERFORM] Problems with pg_locks explosion

2013-04-02 Thread Armand du Plessis
Touch wood but I think I found the problem thanks to these pointers. I
checked the vm.zone_reclaim_mode and mine was set to 0. However just before
the locking starts I can see many of my CPUs flashing red and jump to high
percentage sys usage. When I look at top it's the migration kernel tasks
that seem to trigger it.

So it seems it was a bit trigger happy with task migrations, setting
the kernel.sched_migration_cost
to 500 (5ms) seemed to have resolved my woes. I'm yet to see locks
climb and it's been running stable for a bit. This post was invaluable in
explaining the cause ->
http://www.postgresql.org/message-id/50e4aab1.9040...@optionshouse.com

# Postgres Kernel Tweaks
kernel.sched_migration_cost = 500
# kernel.sched_autogroup_enabled = 0

The second recommended setting 'sched_autogroup_enabled' is not available
on the kernel I'm running but it doesn't seem to be a problem.

Again, thanks again for the help. It was seriously appreciated. Long night
was long.

If things change and the problem pops up again I'll update you guys.

Cheers,

Armand


On Tue, Apr 2, 2013 at 8:43 AM, Mark Kirkwood  wrote:

> Also it is worth checking what your sysctl vm.zone_reclaim_mode is set to
> - if 1 then override to 0. As Jeff mentioned, this gotcha for larger cpu
> number machines has been discussed at length on this list - but still traps
> us now and again!
>
> Cheers
>
> Mark
>
>
> On 02/04/13 19:33, Armand du Plessis wrote:
>
>> I had my reservations about my almost 0% IO usage on the raid0 array as
>> well. I'm looking at the numbers in atop and it doesn't seem to reflect
>> the aggregate of the volumes as one would expect. I'm just happy I am
>> seeing numbers on the volumes, they're not too bad.
>>
>> One thing I was wondering, as a last possible IO resort. Provisioned EBS
>> volumes requires that you maintain a wait queue of 1 for every 200
>> provisioned IOPS to get reliable IO. My wait queue hovers between 0-1
>> and with the 1000 IOPS it should be 5. Even thought about artificially
>> pushing more IO to the volumes but I think Jeff's right, there's some
>> internal kernel voodoo at play here. I have a feeling it'll be under
>> control with pg_pool (if I can just get the friggen setup there right)
>> and then I'll have more time to dig into it deeper.
>>
>> Apologies to the kittens for the interrupting your leave :)
>>
>>
>


Re: [PERFORM] Postgres upgrade, security release, where?

2013-04-02 Thread Dave Page
On Mon, Apr 1, 2013 at 11:43 PM, Mark Kirkwood
 wrote:
> On 02/04/13 13:55, Bruce Momjian wrote:
>>
>> On Tue, Apr  2, 2013 at 09:40:07AM +0900, Ian Lawrence Barwick wrote:

 Due to the security nature of the release, the source and binaries will
 only be publicly available on April 4 --- there are no pre-release
 versions available.
>>>
>>>
>>> The PostgreSQL homepage has a big announcement saying
>>> "PostgreSQL minor versions released!", including a mention of a
>>> "security issue";
>>> unfortunately it's not obvious that this is for the prior 9.2.3 release
>>> and as
>>> the announcement of the upcoming security release
>>> ( http://www.postgresql.org/about/news/1454/ ) does not mention the
>>> new release number, methinks there is plenty of room for confusion :(
>>>
>>> It might be an idea to update the "splash box" with details of the
>>> upcoming
>>> release.
>>
>>
>> I agree updating the "spash box" would make sense.
>>
>
> Or perhaps include a date on said splashes, so we know when to panic :-)

I've added the date to the splash. You can cease panicing now :-)

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres upgrade, security release, where?

2013-04-02 Thread Mark Kirkwood

On 02/04/13 21:34, Dave Page wrote:

On Mon, Apr 1, 2013 at 11:43 PM, Mark Kirkwood
 wrote:

On 02/04/13 13:55, Bruce Momjian wrote:


On Tue, Apr  2, 2013 at 09:40:07AM +0900, Ian Lawrence Barwick wrote:


Due to the security nature of the release, the source and binaries will
only be publicly available on April 4 --- there are no pre-release
versions available.



The PostgreSQL homepage has a big announcement saying
"PostgreSQL minor versions released!", including a mention of a
"security issue";
unfortunately it's not obvious that this is for the prior 9.2.3 release
and as
the announcement of the upcoming security release
( http://www.postgresql.org/about/news/1454/ ) does not mention the
new release number, methinks there is plenty of room for confusion :(

It might be an idea to update the "splash box" with details of the
upcoming
release.



I agree updating the "spash box" would make sense.



Or perhaps include a date on said splashes, so we know when to panic :-)


I've added the date to the splash. You can cease panicing now :-)



...wipes forehead...



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Join between 2 tables always executes a sequential scan on the larger table

2013-04-02 Thread Dieter Rehbein
Hi everybody,

in a project I have a performance problem, which I (and my colleagues) don't 
understand. It's a simple join between 2 of 3 tables:

table-1:   user   (id, user_name, ...).   This table has about 1 million rows 
(999673 rows)
table-2:   competition (57 rows)
table-3:   user_2_competition.  A relation between user and competition. This 
table has about 100.000 rows

The query is a join between table user_2_competition and user and looks like 
this:

select u.id, u.user_name
from user_2_competition uc 
  left join "user" u on u.id = uc.user_id 
where uc.competition_id = '3cc1cb9b3ac132ad013ad01316040001'

The query returns the ID and user_name of all users participating in a 
competition.

What I don't understand: This query executes a sequential scan on user!


The tables have the following indexes:

user_2_competition:  there is an index on user_id and an index on 
competition_id (competition_id is a VARCHAR(32) containing UUIDs)
user:  id is the primary key and has therefore a unique index (the ID is a 
VARCHAR(32), which contains UUIDs).

The database has just been restored from a backup, I've executed ANALYZE for 
both tables.

The output of explain analyze (Postgres 9.2.3):

Hash Left Join  (cost=111357.64..126222.29 rows=41396 width=42) (actual 
time=1982.543..2737.331 rows=41333 loops=1)
  Hash Cond: ((uc.user_id)::text = (u.id)::text)
  ->  Seq Scan on user_2_competition uc  (cost=0.00..4705.21 rows=41396 
width=33) (actual time=0.019..89.691 rows=41333 loops=1)
Filter: ((competition_id)::text = 
'3cc1cb9b3ac132ad013ad01316040001'::text)
Rows Removed by Filter: 80684
  ->  Hash  (cost=90074.73..90074.73 rows=999673 width=42) (actual 
time=1977.604..1977.604 rows=999673 loops=1)
Buckets: 2048  Batches: 128  Memory Usage: 589kB
->  Seq Scan on "user" u  (cost=0.00..90074.73 rows=999673 width=42) 
(actual time=0.004..1178.827 rows=999673 loops=1)
Total runtime: 2740.723 ms


I expected to see an index-scan on user_2_competition with a hash join to user, 
not a sequential scan on user.  I've tried this with Postgres 9.1 and 9.2.3).

Any ideas, what's going on here?

With EXPLAIN ANALYZE I can see, which query plan Postgres is using.  Is there 
any way to find out, WHY postgres uses this query plan? 

best regards
Dieter




The full table schema:

CREATE TABLE user_2_competition
(
idvarchar(32)  NOT NULL,
version   int4 NOT NULL DEFAULT 0,
conditions_confirm_ip varchar(30),
created_date  timestampNOT NULL DEFAULT now(),
deleted   bool NOT NULL DEFAULT false,
last_visittimestamp,
resort_id int4,
role  varchar(255),
caid  int4 NOT NULL,
ponr  int4 NOT NULL,
ktka  int4 NOT NULL,
lfnr  int4 NOT NULL,
total_visits  int8 NOT NULL DEFAULT 0,
verified  bool NOT NULL,
competition_idvarchar(32),
user_id   varchar(32),
competition_terms int4 NOT NULL DEFAULT (-1),
disqualified  bool NOT NULL DEFAULT false,
registration_key_id   int4,

PRIMARY KEY(id)
);

-- Indexes 
CREATE INDEX IDX_USER_ID ON user_2_competition USING btree (user_id);
CREATE INDEX idx_user_2_competition_competition ON user_2_competition USING 
btree (competition_id);
CREATE UNIQUE INDEX user_2_competition_user_id_competition_id_key ON 
user_2_competition USING btree (user_id, competition_id);

-- Foreign key constraints ---
ALTER TABLE user_2_competition
ADD CONSTRAINT fk_user_competition_competition_group
FOREIGN KEY (competition_id) REFERENCES competition (id) ON DELETE CASCADE;
ALTER TABLE user_2_competition
ADD CONSTRAINT fk_user_2_competition_registration_key
FOREIGN KEY (registration_key_id) REFERENCES competition_registration_key 
(id);
ALTER TABLE user_2_competition
ADD CONSTRAINT fk_user_competition_terms
FOREIGN KEY (competition_terms) REFERENCES competition_terms (id);
ALTER TABLE user_2_competition
ADD CONSTRAINT fk_user_competition_user
FOREIGN KEY (user_id) REFERENCES user (id) ON DELETE CASCADE;

-


CREATE TABLE competition
(
id   varchar(32)   NOT NULL,
version  int4  NOT NULL DEFAULT 0,
created_by   varchar(255),
created_date timestamp,
modified_by  varchar(255),
modified_datetimestamp,
deleted  bool  NOT NULL DEFAULT false,
active   bool  NOT NULL DEFAULT false,
average_scorefloat8,
start_time   timestamp NOT 

Re: [PERFORM] Problems with pg_locks explosion

2013-04-02 Thread Armand du Plessis
Jumped the gun a bit. the problem still exists like before. But it's
definitely on the right track, below is the output from top in the seconds
before the cluster locks up. For some reason still insisting on moving
tasks around despite bumping the sched_migration_cost cost up to 100ms.

77 root  RT   0 000 S 32.3  0.0  13:55.20 [migration/24]



26512 postgres  20   0 8601m 7388 4992 R 32.3  0.0   0:02.17 postgres:
other_user (52944) INSERT


   38 root  RT   0 000 S 31.3  0.0  17:26.15 [migration/11]


   65 root  RT   0 0
 00 S 30.0  0.0  13:18.66 [migration/20]



   62 root  RT   0 000 S 29.7  0.0  12:58.81 [migration/19]



   47 root  RT   0 000 S 29.0  0.0  18:16.43 [migration/14]



   29 root  RT   0 000 S 28.7  0.0  25:21.47 [migration/8]



   71 root  RT   0 000 S 28.4  0.0  13:20.31 [migration/22]



   95 root  RT   0 000 S 23.8  0.0  13:37.31 [migration/30]



26518 postgres  20   0 8601m 9684 5228 S 21.2  0.0   0:01.89 postgres:
other_user x(52954) INSERT


6 root  RT   0 000 S 20.5  0.0  39:17.72 [migration/0]



   41 root  RT   0 000 S 19.6  0.0  18:21.36 [migration/12]



   68 root  RT   0 000 S 19.6  0.0  13:04.62 [migration/21]



   74 root  RT   0 000 S 18.9  0.0  13:39.41 [migration/23]



  305 root  20   0 000 S 18.3  0.0  11:34.52 [kworker/27:1]



   44 root  RT   0 000 S 17.0  0.0  18:30.71 [migration/13]



   89 root  RT   0 000 S 16.0  0.0  12:13.42 [migration/28]



7 root  RT   0 000 S 15.3  0.0  21:58.56 [migration/1]



   35 root  RT   0 000 S 15.3  0.0  20:02.05 [migration/10]



   53 root  RT   0 000 S 14.0  0.0  12:51.46 [migration/16]



11254 root   0 -20 21848 7532 2788 S 11.7  0.0  22:35.66 atop 1



   14 root  RT   0 000 S 10.8  0.0  19:36.56 [migration/3]



26463 postgres  20   0 8601m 7492 5100 R 10.8  0.0   0:00.33 postgres:
other_user x(32835) INSERT


   32 root  RT   0 000 S 10.1  0.0  20:46.18 [migration/9]



16793 root  20   0 000 S  6.5  0.0   1:12.72 [kworker/25:0]



   20 root  RT   0 000 S  5.5  0.0  18:51.81 [migration/5]



   48 root  20   0 000 S  5.5  0.0   3:52.93 [kworker/14:0]



On Tue, Apr 2, 2013 at 10:16 AM, Armand du Plessis  wrote:

> Touch wood but I think I found the problem thanks to these pointers. I
> checked the vm.zone_reclaim_mode and mine was set to 0. However just
> before the locking starts I can see many of my CPUs flashing red and jump
> to high percentage sys usage. When I look at top it's the migration kernel
> tasks that seem to trigger it.
>
> So it seems it was a bit trigger happy with task migrations, setting the 
> kernel.sched_migration_cost
> to 500 (5ms) seemed to have resolved my woes. I'm yet to see locks
> climb and it's been running stable for a bit. This post was invaluable in
> explaining the cause ->
> http://www.postgresql.org/message-id/50e4aab1.9040...@optionshouse.com
>
> # Postgres Kernel Tweaks
> kernel.sched_migration_cost = 500
> # kernel.sched_autogroup_enabled = 0
>
> The second recommended setting 'sched_autogroup_enabled' is not available
> on the kernel I'm running but it doesn't seem to be a problem.
>
> Again, thanks again for the help. It was seriously appreciated. Long night
> was long.
>
> If things change and the problem pops up again I'll update you guys.
>
> Cheers,
>
> Armand
>
>
> On Tue, Apr 2, 2013 at 8:43 AM, Mark Kirkwood <
> mark.kirkw...@catalyst.net.nz> wrote:
>
>> Also it is worth checking what your sysctl vm.zone_reclaim_mode is set to
>> - if 1 then override to 0. As Jeff mentioned, this gotcha for larger cpu
>> number machines has been discussed at length on this list - but still traps
>> us now and again!
>>
>> Cheers
>>
>> Mark
>
>


[PERFORM] Re: Join between 2 tables always executes a sequential scan on the larger table

2013-04-02 Thread Igor Neyman
From: Dieter Rehbein [mailto:dieter.rehb...@skiline.cc] 
Sent: Tuesday, April 02, 2013 4:52 AM
To: pgsql-performance@postgresql.org
Subject: Join between 2 tables always executes a sequential scan on the larger 
table

Hi everybody,

in a project I have a performance problem, which I (and my colleagues) don't 
understand. It's a simple join between 2 of 3 tables:

table-1:   user   (id, user_name, ...).   This table has about 1 million rows 
(999673 rows)
table-2:   competition (57 rows)
table-3:   user_2_competition.  A relation between user and competition. This 
table has about 100.000 rows

The query is a join between table user_2_competition and user and looks like 
this:

select u.id, u.user_name
from user_2_competition uc 
          left join "user" u on u.id = uc.user_id 
where uc.competition_id = '3cc1cb9b3ac132ad013ad01316040001'

The query returns the ID and user_name of all users participating in a 
competition.

What I don't understand: This query executes a sequential scan on user!


The tables have the following indexes:

user_2_competition:  there is an index on user_id and an index on 
competition_id (competition_id is a VARCHAR(32) containing UUIDs)
user:  id is the primary key and has therefore a unique index (the ID is a 
VARCHAR(32), which contains UUIDs).

The database has just been restored from a backup, I've executed ANALYZE for 
both tables.

The output of explain analyze (Postgres 9.2.3):

Hash Left Join  (cost=111357.64..126222.29 rows=41396 width=42) (actual 
time=1982.543..2737.331 rows=41333 loops=1)
  Hash Cond: ((uc.user_id)::text = (u.id)::text)
  ->  Seq Scan on user_2_competition uc  (cost=0.00..4705.21 rows=41396 
width=33) (actual time=0.019..89.691 rows=41333 loops=1)
        Filter: ((competition_id)::text = 
'3cc1cb9b3ac132ad013ad01316040001'::text)
        Rows Removed by Filter: 80684
  ->  Hash  (cost=90074.73..90074.73 rows=999673 width=42) (actual 
time=1977.604..1977.604 rows=999673 loops=1)
        Buckets: 2048  Batches: 128  Memory Usage: 589kB
        ->  Seq Scan on "user" u  (cost=0.00..90074.73 rows=999673 width=42) 
(actual time=0.004..1178.827 rows=999673 loops=1)
Total runtime: 2740.723 ms


I expected to see an index-scan on user_2_competition with a hash join to user, 
not a sequential scan on user.  I've tried this with Postgres 9.1 and 9.2.3).

Any ideas, what's going on here?

With EXPLAIN ANALYZE I can see, which query plan Postgres is using.  Is there 
any way to find out, WHY postgres uses this query plan? 

best regards
Dieter

---

Dieter, 
why do you think index-scan on user_2_competition would be better?

Based on huge number of rows returned (41333 out of total ~12 in the table) 
from this table optimizer decided that Seq Scan is better than index scan.
You don't show QUERY TUNING parameters from Postgresql.conf, are they default?
Playing with optimizer parameters (lowering random_page_cost, lowering 
cpu_index_tuple_cost , increasing effective_cache_size, or just setting 
enable_seqscan = off), you could try to force "optimizer" to use index, and see 
if you are getting better results.

Regards,
Igor Neyman


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Re: Join between 2 tables always executes a sequential scan on the larger table

2013-04-02 Thread Tom Lane
Igor Neyman  writes:
> The output of explain analyze (Postgres 9.2.3):

> Hash Left Join  (cost=111357.64..126222.29 rows=41396 width=42) (actual 
> time=1982.543..2737.331 rows=41333 loops=1)
>   Hash Cond: ((uc.user_id)::text = (u.id)::text)
>   ->  Seq Scan on user_2_competition uc  (cost=0.00..4705.21 rows=41396 
> width=33) (actual time=0.019..89.691 rows=41333 loops=1)
> Filter: ((competition_id)::text = 
> '3cc1cb9b3ac132ad013ad01316040001'::text)
> Rows Removed by Filter: 80684
>   ->  Hash  (cost=90074.73..90074.73 rows=999673 width=42) (actual 
> time=1977.604..1977.604 rows=999673 loops=1)
> Buckets: 2048  Batches: 128  Memory Usage: 589kB
> ->  Seq Scan on "user" u  (cost=0.00..90074.73 rows=999673 width=42) 
> (actual time=0.004..1178.827 rows=999673 loops=1)
> Total runtime: 2740.723 ms


> I expected to see an index-scan on user_2_competition with a hash join to 
> user, not a sequential scan on user.  I've tried this with Postgres 9.1 and 
> 9.2.3).

According to the numbers, you're fetching about a third of the
user_2_competition table, which is well past the point where an
indexscan is of any use.  It's picking the seqscan because it thinks
that's faster, and I'm sure it's right.

The aspect of this plan that actually seems a bit dubious is that it's
hashing the larger input table rather than the smaller one.  There's
a thread going on about that in -hackers right now; we think it's
probably putting too much emphasis on the distribution of the join key
as opposed to the size of the table.

One thing that would help is increasing work_mem --- it looks like you
are using the default 1MB.  Cranking that up to a few MB would reduce
the number of hash batches needed.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Problems with pg_locks explosion

2013-04-02 Thread Armand du Plessis
It's now running as expected, I made a few other tweaks to get it to an
operational state again. So just for closure on this dark period below some
notes.

There was two triggers that caused the almost instant backlog of locks. As
suspected the one was scheduler that caused endless problems whenever it
started migrating tasks. This would lock up the database (and server) for a
second or more after which a few thousand locks existed.

Setting the kernel.sched_migration_cost to 5ms didn't have the desired
effect. The scheduler would still stop the world and break down for a few
seconds. After much anguish and research (this is a pretty good explanation
of the scheduler tunables
http://events.linuxfoundation.org/slides/2011/linuxcon/lcna2011_rajan.pdf)
I adjusted :

sysctl -w kernel.sched_min_granularity_ns=900
sysctl -w kernel.sched_wakeup_granularity_ns=1200

Since then I haven't had an interruption from migration.

I also switched hugepage to not be as aggressive, it was also intrusive and
my Postgres not configured to use it. "echo madvise >
/sys/kernel/mm/transparent_hugepage/defrag"

After these changes things started smoothing out and running like it
should. I also found that if you are running on striped EBS volumes you
should really try and get them busy to get consistent performance. I
checked with Amazon and the usage I see on the individual modules were
correct. (Not the RAID figure but the numbers on the individual volumes,
they were idling)

Thanks again for all the help and suggestions.

Armand


On Tue, Apr 2, 2013 at 11:55 AM, Armand du Plessis  wrote:

> Jumped the gun a bit. the problem still exists like before. But it's
> definitely on the right track, below is the output from top in the seconds
> before the cluster locks up. For some reason still insisting on moving
> tasks around despite bumping the sched_migration_cost cost up to 100ms.
>
> 77 root  RT   0 000 S 32.3  0.0  13:55.20 [migration/24]
>
>
>
> 26512 postgres  20   0 8601m 7388 4992 R 32.3  0.0   0:02.17 postgres:
> other_user (52944) INSERT
>
>
>38 root  RT   0 000 S 31.3  0.0  17:26.15
> [migration/11]
>
>65 root
>  RT   0 000 S 30.0  0.0  13:18.66 [migration/20]
>
>
>
>62 root  RT   0 000 S 29.7  0.0  12:58.81
> [migration/19]
>
>
>47 root  RT   0 000 S 29.0  0.0  18:16.43
> [migration/14]
>
>
>29 root  RT   0 000 S 28.7  0.0  25:21.47 [migration/8]
>
>
>
>71 root  RT   0 000 S 28.4  0.0  13:20.31
> [migration/22]
>
>
>95 root  RT   0 000 S 23.8  0.0  13:37.31
> [migration/30]
>
>
> 26518 postgres  20   0 8601m 9684 5228 S 21.2  0.0   0:01.89 postgres:
> other_user x(52954) INSERT
>
>
> 6 root  RT   0 000 S 20.5  0.0  39:17.72 [migration/0]
>
>
>
>41 root  RT   0 000 S 19.6  0.0  18:21.36
> [migration/12]
>
>
>68 root  RT   0 000 S 19.6  0.0  13:04.62
> [migration/21]
>
>
>74 root  RT   0 000 S 18.9  0.0  13:39.41
> [migration/23]
>
>
>   305 root  20   0 000 S 18.3  0.0  11:34.52
> [kworker/27:1]
>
>
>44 root  RT   0 000 S 17.0  0.0  18:30.71
> [migration/13]
>
>
>89 root  RT   0 000 S 16.0  0.0  12:13.42
> [migration/28]
>
>
> 7 root  RT   0 000 S 15.3  0.0  21:58.56 [migration/1]
>
>
>
>35 root  RT   0 000 S 15.3  0.0  20:02.05
> [migration/10]
>
>
>53 root  RT   0 000 S 14.0  0.0  12:51.46
> [migration/16]
>
>
> 11254 root   0 -20 21848 7532 2788 S 11.7  0.0  22:35.66 atop 1
>
>
>
>14 root  RT   0 000 S 10.8  0.0  19:36.56 [migration/3]
>
>
>
> 26463 postgres  20   0 8601m 7492 5100 R 10.8  0.0   0:00.33 postgres:
> other_user x(32835) INSERT
>
>
>32 root  RT   0 000 S 10.1  0.0  20:46.18 [migration/9]
>
>
>
> 16793 root  20   0 000 S  6.5  0.0   1:12.72
> [kworker/25:0]
>
>
>20 root  RT   0 000 S  5.5  0.0  18:51.81 [migration/5]
>
>
>
>48 root  20   0 000 S  5.5  0.0   3:52.93
> [kworker/14:0]
>
>
> On Tue, Apr 2, 2013 at 10:16 AM, Armand du Plessis  wrote:
>
>> Touch wood but I think I found the problem thanks to these pointers. I
>> checked the vm.zone_reclaim_mode and mine was set to 0. However just
>> before the locking starts I can see many of my CPUs flashing red and jump
>> to high percentage sys usage. When I look at top it's the migration kernel
>> tasks that seem to trigger it.
>>
>> So it seems it was a bit trigger happy with task migrations, setting the 
>> kernel.sched_migration_cost
>> to 500 (5ms) seemed to have resolved my woes. I'm yet to see locks
>> climb and it's been running stable for a bit. This post was invaluable in
>> explaining the cause ->
>> http://www.postgresql.org/

[PERFORM] Planner is getting wrong row count

2013-04-02 Thread AI Rumman
Why is the following query getting wrong estimation of rows?
I am using Postgresql 9.2.1 with default_statistics_target = 100.
I execute vacuum analyze each night.

 explain analyze
SELECT
entity.id AS "Leads_id", entity.type AS "Leads_type" ,
leads.firstname AS "Leads_firstname",
leads.lastname AS "Leads_lastname"
FROM leads
INNER JOIN entity ON leads.leadid=entity.id
 LEFT JOIN groups ON groups.groupid = entity.smownerid
 LEFT join users ON entity.smownerid= users.id
WHERE entity.type='Leads' AND entity.deleted=0  AND leads.converted=0
 Hash Join  (cost=14067.90..28066.53 rows=90379 width=26) (actual
time=536.009..1772.910 rows=337139 loops=1)
   Hash Cond: (leads.leadid = entity.id)
   ->  Seq Scan on leads  (cost=0.00..7764.83 rows=533002 width=18) (actual
time=0.008..429.576 rows=532960 loops=1)
 Filter: (converted = 0)
   ->  Hash  (cost=9406.25..9406.25 rows=372932 width=16) (actual
time=535.800..535.800 rows=342369 loops=1)
 Buckets: 65536  Batches: 1  Memory Usage: 16049kB
 ->  Index Scan using entity_type_idx on entity
 (cost=0.00..9406.25 rows=372932 width=16) (actual time=0.030..305.250
rows=342369 loops=1)
   Index Cond: ((type)::text = 'Leads'::text)


\d leads
 Table "public.leads"
  Column  |  Type  |   Modifiers

--++---
 leadid   | integer| not null
 email| character varying(100) |
 interest | character varying(50)  |
 firstname| character varying(100) |
 salutation   | character varying(200) |
 lastname | character varying(100) | not null
 company  | character varying(200) | not null
 annualrevenue| integer| default 0
 industry | character varying(200) |
 campaign | character varying(30)  |
 rating   | character varying(200) |
 leadstatus   | character varying(50)  |
 leadsource   | character varying(200) |
 converted| integer| default 0
 designation  | character varying(200) | default 'SalesMan'::character
varying
 licencekeystatus | character varying(50)  |
 space| character varying(250) |
 comments | text   |
 priority | character varying(50)  |
 demorequest  | character varying(50)  |
 partnercontact   | character varying(50)  |
 productversion   | character varying(20)  |
 product  | character varying(50)  |
 maildate | date   |
 nextstepdate | date   |
 fundingsituation | character varying(50)  |
 purpose  | character varying(50)  |
 evaluationstatus | character varying(50)  |
 transferdate | date   |
 revenuetype  | character varying(50)  |
 noofemployees| integer|
 yahooid  | character varying(100) |
 assignleadchk| integer| default 0
 department   | character varying(200) |
 emailoptout  | character varying(3)   | default 0
 siccode  | character varying(50)  |
Indexes:
"leads_pkey" PRIMARY KEY, btree (leadid)
"ftx_en_leads_company" gin (to_tsvector('v_en'::regconfig,
for_fts(company::text)))
"ftx_en_leads_email" gin (to_tsvector('v_en'::regconfig,
for_fts(email::text)))
"ftx_en_leads_emailoptout" gin (to_tsvector('v_en'::regconfig,
for_fts(emailoptout::text)))
"ftx_en_leads_firstname" gin (to_tsvector('v_en'::regconfig,
for_fts(firstname::text)))
"ftx_en_leads_lastname" gin (to_tsvector('v_en'::regconfig,
for_fts(lastname::text)))
"ftx_en_leads_yahooid" gin (to_tsvector('v_en'::regconfig,
for_fts(yahooid::text)))
"leads_converted_idx" btree (converted)
"leads_leadsource_idx" btree (leadsource)
"leads_leadstatus_idx" btree (leadstatus)



\d entity
 Table "public.entity"
   Column   |Type |  Modifiers

+-+--
 id  | integer | not null
 smcreatorid| integer | not null default 0
 smownerid  | integer | not null default 0
 modifiedby | integer | not null default 0
 setype | character varying(30)   | not null
 description| text|
 createdtime| timestamp without time zone | not null
 modifiedtime   | timestamp without time zone | not null
 viewedtime | timestamp without time zone |
 status | character varying(50)   |
 version| integer | not null default 0
 presence   | integer | default 1
 deleted| integer | not null default 0
 owner_type | character(1)| not null default
'U'::bpchar
 last_a