On Sep 19, 2006, at 1:51 AM, Simon Riggs wrote:

On Mon, 2006-09-18 at 14:08 -0700, Casey Duncan wrote:
I've reported variants of this in the past, but this case is entirely
repeatable.

Executing this query:

select st_id, min(seed_id) as "initial_seed_id", count(*) as
"seed_count"
from seed group by st_id;

The query plan and table stats are:

                               QUERY PLAN
--------------------------------------------------------------------- --
HashAggregate  (cost=1362694.83..1365164.68 rows=164656 width=16)
-> Seq Scan on seed (cost=0.00..964065.62 rows=53150562 width=16)


  relname | relpages |  reltuples
---------+----------+-------------
seed    |   428880 | 5.26984e+07


The hashagg doesn't yet have scroll to disk capability, so a bad
estimation of ndistinct will cause this to fail (at any release). This
is a known issue for certain distributions of data only. The workaround
is the one you know about already: enable_hashagg = off

Ok, is that something for the TODO list? I took a glance and didn't see it.


I'm interested in collecting info on the distribution of data.
Can you post:

select tablename, attname, n_distinct from pg_stats
where attname = 'st_id';

  tablename   | attname | n_distinct
--------------+---------+------------
st            | st_id   |         -1
seed          | st_id   |     164656
feed          | st_id   |      14250
book          | st_id   |      14856
legacy_st     | st_id   |         -1
(5 rows)

I ran analyze after this, but the results were roughly the same.

select count(distinct st_id) from seed;

  count
----------
40418083
(1 row)

Looks a tad bit different than the above ;^)

and also the table definition, including the PK

Table "public.seed"
    Column    |            Type             |   Modifiers
--------------+-----------------------------+---------------
seed_id      | bigint                      | not null
mc_id        | character varying(20)       |
st_id        | bigint                      |
date_created | timestamp without time zone | default now()
Indexes:
    "seed_pkey" PRIMARY KEY, btree (seed_id)
    "seed_st_mc_id_idx" UNIQUE, btree (mc_id, st_id)
    "seed_mc_id_idx" btree (mc_id)
    "seed_st_id" btree (st_id)
Foreign-key constraints:
"seed_st_id_fkey" FOREIGN KEY (st_id) REFERENCES st(st_id) ON DELETE RESTRICT

Thanks

-Casey



---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Reply via email to