I posted that in a subsequent mail, but here it is again:
I'm interested in collecting info on the distribution of data.
Can you post:
select tablename, attname, n_distinct from pg_stats
where attname = 'st_id';
tablename | attname | n_distinct
--------------+---------+------------
st | st_id | -1
seed | st_id | 164656
feed | st_id | 14250
book | st_id | 14856
legacy_st | st_id | -1
(5 rows)
I ran analyze after this, but the results were roughly the same.
select count(distinct st_id) from seed;
count
----------
40418083
(1 row)
Looks a tad bit different than the above ;^)
and also the table definition, including the PK
Table "public.seed"
Column | Type | Modifiers
--------------+-----------------------------+---------------
seed_id | bigint | not null
mc_id | character varying(20) |
st_id | bigint |
date_created | timestamp without time zone | default now()
Indexes:
"seed_pkey" PRIMARY KEY, btree (seed_id)
"seed_st_mc_id_idx" UNIQUE, btree (mc_id, st_id)
"seed_mc_id_idx" btree (mc_id)
"seed_st_id" btree (st_id)
Foreign-key constraints:
"seed_st_id_fkey" FOREIGN KEY (st_id) REFERENCES st(st_id) ON
DELETE RESTRICT
I imagine this means I need to crank up the statistics on that column.
Thanks
-Casey
On Sep 23, 2006, at 3:17 PM, Tom Lane wrote:
Casey Duncan <[EMAIL PROTECTED]> writes:
select st_id, min(seed_id) as "initial_seed_id", count(*) as
"seed_count" from seed group by st_id;
The query plan and table stats are:
QUERY PLAN
---------------------------------------------------------------------
--
HashAggregate (cost=1362694.83..1365164.68 rows=164656 width=16)
-> Seq Scan on seed (cost=0.00..964065.62 rows=53150562
width=16)
How many distinct st_id values are there really? The planner's
evidently expecting 164656 but I suppose that's wrong? What's
in pg_stats for st_id?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match