Here's the query, obfuscated manually by me (the dates here match the plan below):

SELECT 'Foo' as system_function,
       stores.name as store,
       lt.owner,
       lt.minute_of_day,
       lt.records
FROM foo.stores
LEFT OUTER JOIN (
    SELECT lts.store_pkey,
           lts.owner,
           date_trunc('minute', lts.date_gifted) as minute_of_day,
           count(*) as records
    FROM foo.gifts lts
    WHERE lts.date_added > '2018-07-14 11:13:05'
      AND lts.date_added < '2018-08-13 14:14:21'
      AND lts.date_gifted >= '2018-08-13 11:13:05'
      AND lts.date_gifted < '2018-08-13 14:14:21'
    GROUP BY 1, 2, 3
    ORDER BY 1
) lt ON lt.store_pkey = stores.pkey
WHERE lt.records IS NOT NULL;
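For reference, the direction I'm testing for the rewrite looks roughly like this. It's an untested sketch, not the final query: the index name gifts_date_gifted is one I just made up, and I've kept the date_added predicates only because the current report has them.

-- Build the new index without blocking writes on this busy table.
CREATE INDEX CONCURRENTLY gifts_date_gifted
    ON foo.gifts (date_gifted);

-- With that index in place, the planner can drive the scan from the
-- narrow date_gifted window instead of the month-wide date_added range.
SELECT lts.store_pkey,
       lts.owner,
       date_trunc('minute', lts.date_gifted) as minute_of_day,
       count(*) as records
FROM foo.gifts lts
WHERE lts.date_gifted >= '2018-08-13 11:13:05'
  AND lts.date_gifted <  '2018-08-13 14:14:21'
  AND lts.date_added  >  '2018-07-14 11:13:05'
  AND lts.date_added  <  '2018-08-13 14:14:21'
GROUP BY 1, 2, 3;

The thinking: the date_gifted window is only about three hours, while the date_added range spans a month, so scanning by date_gifted should avoid reading and discarding the ~938k rows the current plan throws away.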
The foo.gifts table is pretty much the core table of our database. It's big and very active. There is an index on date_added but not yet on date_gifted. I'm working to rewrite the query while the dev checks whether we even need this report anymore.

On Wed, Aug 15, 2018 at 2:39 PM, Adrian Klaver <adrian.kla...@aklaver.com> wrote:

> On 08/15/2018 12:31 PM, Don Seiler wrote:
>
>> PostgreSQL 9.6.6 on CentOS.
>>
>> We have a report query that has gone from taking a few seconds to run to
>> taking a few minutes since mid-July. Looking at the output of EXPLAIN
>> ANALYZE, the row count estimates are way off, even though this table was
>> analyzed just a day or so ago. What's more bizarre to me is that the row
>> count estimate is *always* 75 for every node of the plan, while the
>> actual row count is in the hundreds or thousands. This table is one of
>> the busiest tables in our production database (many inserts and
>> updates). It is autovacuumed and autoanalyzed a few times per week,
>> although I'm looking to change that to a nightly manual schedule to
>> avoid daytime autovacuums.
>>
>> Hash Join  (cost=1869142.34..1869146.15 rows=75 width=88) (actual time=179877.869..179878.011 rows=759 loops=1)
>>   Hash Cond: (stores.pkey = lt.store_pkey)
>>   Buffers: shared hit=1654593 read=331897 dirtied=249
>>   ->  Seq Scan on stores  (cost=0.00..2.77 rows=77 width=22) (actual time=0.007..0.023 rows=78 loops=1)
>>         Buffers: shared hit=2
>>   ->  Hash  (cost=1869141.40..1869141.40 rows=75 width=50) (actual time=179877.847..179877.847 rows=759 loops=1)
>>         Buckets: 1024  Batches: 1  Memory Usage: 73kB
>>         Buffers: shared hit=1654591 read=331897 dirtied=249
>>         ->  Subquery Scan on lt  (cost=1869138.59..1869141.40 rows=75 width=50) (actual time=179875.976..179877.697 rows=759 loops=1)
>>               Buffers: shared hit=1654591 read=331897 dirtied=249
>>               ->  GroupAggregate  (cost=1869138.59..1869140.65 rows=75 width=50) (actual time=179875.976..179877.606 rows=759 loops=1)
>>                     Group Key: lts.store_pkey, lts.owner, (date_trunc('minute'::text, lts.date_gifted))
>>                     Filter: (count(*) IS NOT NULL)
>>                     Buffers: shared hit=1654591 read=331897 dirtied=249
>>                     ->  Sort  (cost=1869138.59..1869138.78 rows=75 width=42) (actual time=179875.961..179876.470 rows=6731 loops=1)
>>                           Sort Key: lts.store_pkey, lts.owner, (date_trunc('minute'::text, lts.date_gifted))
>>                           Sort Method: quicksort  Memory: 757kB
>>                           Buffers: shared hit=1654591 read=331897 dirtied=249
>>                           ->  Index Scan using gifts_date_added on gifts lts  (cost=0.56..1869136.25 rows=75 width=42) (actual time=190.657..179870.165 rows=6731 loops=1)
>>                                 Index Cond: ((date_added > '2018-07-14 11:13:05'::timestamp without time zone) AND (date_added < '2018-08-13 14:14:21'::timestamp without time zone))
>>                                 Filter: ((date_gifted >= '2018-08-13 11:13:05'::timestamp without time zone) AND (date_gifted < '2018-08-13 14:14:21'::timestamp without time zone))
>>                                 Rows Removed by Filter: 938197
>>                                 Buffers: shared hit=1654591 read=331897 dirtied=249
>> Planning time: 0.426 ms
>> Execution time: 179893.894 ms
>>
>> I don't have a version of this query from before this summer, but
>> getting the explain plan for older data from older sandboxes shows a
>> similar plan.
>
> I don't have an answer, just a question:
>
> Can you provide the actual query and the table schema?
>
>> Sidenote: I am suggesting that an index be added on the date_gifted
>> field, as that is far more selective and avoids throwing rows away.
>> However, I'm very interested in why every node dealing with the gifts
>> table thinks rows=75 when the actual is much, much higher. And 75 seems
>> like too round a number to be random?
>>
>> --
>> Don Seiler
>> www.seiler.us

> --
> Adrian Klaver
> adrian.kla...@aklaver.com
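Also, while I dig into the rows=75 estimate, this is what I plan to look at first. A sketch only: the statistics target of 1000 is just an example value, and I haven't run any of this against production yet.

-- Inspect what the planner currently believes about the two filter columns.
SELECT attname, n_distinct, null_frac, correlation
FROM pg_stats
WHERE schemaname = 'foo'
  AND tablename = 'gifts'
  AND attname IN ('date_added', 'date_gifted');

-- Raise the per-column statistics target and re-analyze, in case the
-- default sample (default_statistics_target = 100) is too small for a
-- table this large and busy.
ALTER TABLE foo.gifts ALTER COLUMN date_gifted SET STATISTICS 1000;
ANALYZE foo.gifts;

--
Don Seiler
www.seiler.us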