here's the explain analyze result : http://explain.depesz.com/s/Mvv and http://explain.depesz.com/s/xxF9
it seems that i need to dig more on query planner parameter. BTW, thanks all for the helps. On Sat, Aug 23, 2014 at 4:33 PM, Alban Hertroys <haram...@gmail.com> wrote: > On 23 Aug 2014, at 4:34, Soni M <diptat...@gmail.com> wrote: > > On Fri, Aug 22, 2014 at 9:10 PM, Alban Hertroys <haram...@gmail.com> > wrote: > > On 22 August 2014 14:26, Soni M <diptat...@gmail.com> wrote: > > > Currently we have only latest_transmission_id as FK, described here : > > > TABLE "ticket" CONSTRAINT "fkcbe86b0c6ddac9e" FOREIGN KEY > > > (latest_transmission_id) REFERENCES transmission_base(transmission_id) > > > > > > Change the query to include only FK still result the same: > > > explain select t.ticket_id , > > > tb.transmission_id > > > from ticket t, > > > transmission_base tb > > > where t.latest_transmission_id = tb.transmission_id > > > and tb.parse_date > ('2014-07-31'); > > > QUERY PLAN > > > > ---------------------------------------------------------------------------------------------------------------------------------- > > > Hash Join (cost=113928.06..2583606.96 rows=200338 width=8) > > > Hash Cond: (t.latest_transmission_id = tb.transmission_id) > > > -> Seq Scan on ticket t (cost=0.00..1767767.26 rows=69990826 > width=8) > > > -> Hash (cost=108923.38..108923.38 rows=400374 width=4) > > > -> Index Scan using transmission_base_by_parse_date on > > > transmission_base tb (cost=0.00..108923.38 rows=400374 width=4) > > > Index Cond: (parse_date > '2014-07-31 > 00:00:00'::timestamp > > > without time zone) > > > (6 rows) > > > > Do you have an index on ticket (latest_transmission_id)? > > > > Yes, both t.latest_transmission_id and tb.transmission_id is indexed. > > > > Indexes: > > "transmission_base_pkey" PRIMARY KEY, btree (transmission_id) CLUSTER > > Indexes: > > "ticket_by_latest_transmission" btree (latest_transmission_id) > > Okay, so we got those indexes. So much for the low-hanging fruit. > > From the above plan we learn that the database estimates[1] that 400k rows > from transmission match your condition (parse_date > '2014-07-31’). The > ticket table has a foreign key to that table, which suggests a 1:n > relationship. It also has around 70M rows, or at least the database seems > to think that about that amount will match those 400k transmissions. > > That means that if on average 175 (=70M/400k) ticket ID’s match a > transmission ID, the database would be needing all those 70M rows anyway - > and even if it only needs every 175th row, a sequential scan is not a > particularly inefficient way to go about this. > The alternative is a whole lot of index lookups, probably not in the same > order as either the index or the rows on disk, meaning quite a bit of > random disk I/O. > > I’m suspecting that the cost estimates for this query with seq-scans > disabled aren’t very different, provided doing so comes up with a > comparable plan. > > Things you might want to verify/try: > * Are those estimated numbers of rows accurate? If not, is autovacuum (or > scheduled vacuum) keeping up with the amount of data churn on these tables? > Do you collect a sufficiently large sample for the statistics? > * How much bloat is in these tables/indexes? > * Did you change planner settings (such as disabling bitmap scans; I kind > of expected one here) or did you change cost estimate parameters? > * Does it help to put an index on transmission (parse_date, > transmission_id)? > * If none of that helps, we’re going to need the output of explain analyze > - that will probably take long to create, so you might as well start with > that and do the other stuff at the side. > > What kind of hardware are these disks on? Is it possible that disk I/O on > this particular machine is relatively slow (relative to the seq/random cost > factor for disk access as specified in your postgresql.conf)? > > Cheers, > > Alban Hertroys > > [1] You did not provide explain analyse output, so we only have estimates > to work with. > -- > If you can't see the forest for the trees, > cut the trees and you'll find there is no forest. > > -- Regards, Soni Maula Harriz