I've collected all the relevant info (I think so, at least) and put it here: the table in question is used to keep filepath data for files on a hard drive. The query in question is used to retrieve items that should be backed up but have not yet been.
The relevant columns of the table:

                               Table "public.item"
      Column      |            Type             | Collation | Nullable |             Default
------------------+-----------------------------+-----------+----------+----------------------------------
 id               | bigint                      |           | not null | nextval('item_id_seq'::regclass)
 shouldbebackedup | boolean                     |           | not null | true
 backupperformed  | boolean                     |           | not null | false
 itemCreated      | timestamp without time zone |           |          | now()
 filepath         | text                        |           |          |

The existing index, which no longer gets used:

"index_in_question" btree (shouldbebackedup, backupperformed, itemCreated, filepath) WHERE shouldbebackedup = true AND backupperformed = false

The new index, made from exactly the same columns and conditions, gets used immediately after creation:

CREATE INDEX CONCURRENTLY index_test ON item USING btree (shouldbebackedup, backupperformed, itemCreated, filepath) WHERE shouldbebackedup = true AND backupperformed = false;

The query in question looks something like this:

select * from item
where shouldbebackedup = true
  and itemCreated <= '2020-06-05 00:00:00.000'
  and backupperformed = false
order by filepath asc, id asc
limit 100 offset 10400;

Having done a count, there are around 13000 matching items without the offset and limit. That being said, the amount is entirely dependent on what was added on a previous day.

I tried creating an extended statistics object like this, but it had no effect:

CREATE STATISTICS s1 (dependencies) ON shouldbebackedup, backupperformed FROM item;

Settings from the conf file that I think are related:

shared_buffers = 1024MB
effective_cache_size = 2048MB
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 32MB

Finally, I state again that this database gets a nightly "vacuum analyze".

My thanks for looking at this and any suggestions you might have.

Regards,
Koen

On Thu, Jun 4, 2020 at 7:08 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Adrian Klaver <adrian.kla...@aklaver.com> writes:
> > On 6/4/20 9:43 AM, Tom Lane wrote:
> >> It's possible that the index had bloated to the point where the planner
> >> thought it was cheaper to use a seqscan.  Did you make a note of the
> >> cost estimates for the different plans?
>
> > I missed the part where the OP pointed to a SO question. In that
> > question were links to explain.depesz.com output.
>
> Ah, I didn't bother to chase that link either.
>
> So the cost estimates are only a fraction of a percent apart, making
> it unsurprising for not-so-large changes in the index size to cause
> a flip in the apparently-cheapest plan.  The real question then is
> why the cost estimates aren't actually modeling the real execution
> times very well; and I'd venture that that question boils down to
> why is this rowcount estimate so far off:
>
> >          ->  Parallel Seq Scan on oscar mike_three
> >              (cost=0.000..1934568.500 rows=2385585 width=3141) (actual
> >              time=159.800..158018.961 rows=23586 loops=3)
> >                Filter: (four AND (NOT bravo) AND (zulu <=
> >                'echo'::timestamp without time zone))
> >                Rows Removed by Filter: 8610174
>
> We're not going to be able to answer that if the OP doesn't wish
> to decloak his data a bit more ... but a reasonable guess is that
> those filter conditions are correlated.  With late-model Postgres
> you might be able to improve matters by creating extended statistics
> for this table.
>
>                         regards, tom lane
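
For what it's worth, a minimal sketch of that last suggestion, reusing the table, column, and statistics names from the post above (an assumption for illustration, not something actually run in this thread): the (dependencies) kind mainly helps the planner with the correlated equality conditions on the two boolean columns, and extended statistics are only populated the next time the table is analyzed.

-- recreate the statistics object and analyze so the planner can actually use it
DROP STATISTICS IF EXISTS s1;
CREATE STATISTICS s1 (dependencies) ON shouldbebackedup, backupperformed FROM item;
ANALYZE item;

-- then re-check the row estimate and the chosen plan for the original query
EXPLAIN (ANALYZE, BUFFERS)
select * from item
where shouldbebackedup = true
  and itemCreated <= '2020-06-05 00:00:00.000'
  and backupperformed = false
order by filepath asc, id asc
limit 100 offset 10400;

If the seq scan's row estimate is still far from the actual rows after that, comparing the estimated and actual row counts in the two plans would be the next thing to look at.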