The requested result: https://explain.depesz.com/s/G7mU
Also, the data from the statistic itself:

=> SELECT stxname, stxkeys, stxdependencies
-> FROM pg_statistic_ext
-> WHERE stxname = 's1';
 stxname | stxkeys | stxdependencies
---------+---------+-----------------
 s1      | 29 35   | <NULL>

On Fri, Jun 5, 2020 at 4:15 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:

> On 6/5/20 7:05 AM, Koen De Groote wrote:
> > I've collected all relevant info (I think so, at least) and put it here:
> >
> > The table in question is used to keep filepath data of files on a
> > hard drive.
> > The query in question is used to retrieve items which should be backed
> > up, but have not yet been.
> >
> > The relevant columns of the table:
> >
> >                           Table "public.item"
> >       Column       |            Type             | Collation | Nullable |             Default
> > ------------------+-----------------------------+-----------+----------+----------------------------------
> >  id               | bigint                      |           | not null | nextval('item_id_seq'::regclass)
> >  shouldbebackedup | boolean                     |           | not null | true
> >  backupperformed  | boolean                     |           | not null | false
> >  itemCreated      | timestamp without time zone |           |          | now()
> >  filepath         | text                        |           |          |
> >
> >
> > The existing index, which no longer gets used:
> > "index_in_question" btree (shouldbebackedup, backupperformed,
> > itemCreated, filepath) WHERE shouldbebackedup = true AND
> > backupperformed = false
> >
> > The new index, made of the exact same columns and conditions, gets
> > used immediately after creation:
> > CREATE INDEX CONCURRENTLY index_test ON item USING btree
> > (shouldbebackedup, backupperformed, itemCreated, filepath)
> > WHERE shouldbebackedup = true AND backupperformed = false;
> >
> >
> > The query in question will look something like this:
> > select * from item where shouldbebackedup=true and
> > itemCreated<='2020-06-05 00:00:00.000' and backupperformed=false
> > order by filepath asc, id asc limit 100 offset 10400;
> > The result of EXPLAIN ANALYZE for above.
> >
> > Having done a count, there are around 13000 items here, without the
> > offset and limit.
> > That being said, the amount is entirely dependent on what was added on
> > a previous day.
> >
> > I tried creating an extended statistic, like this, but it had no effect:
> > CREATE STATISTICS s1 (dependencies) ON shouldbebackedup, backupperformed
> > FROM item;
> >
> > Settings from the conf file I think are related:
> >
> > shared_buffers = 1024MB
> > effective_cache_size = 2048MB
> > random_page_cost = 1.1
> > effective_io_concurrency = 200
> > work_mem = 32MB
> >
> > Finally, I state again that this database gets a nightly "vacuum analyze".
> >
> > My thanks for looking at this and any suggestions one might have.
> >
> > Regards,
> > Koen
> >
> > On Thu, Jun 4, 2020 at 7:08 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >
> >     Adrian Klaver <adrian.kla...@aklaver.com> writes:
> >     > On 6/4/20 9:43 AM, Tom Lane wrote:
> >     >> It's possible that the index had bloated to the point where the
> >     >> planner thought it was cheaper to use a seqscan. Did you make a
> >     >> note of the cost estimates for the different plans?
> >
> >     > I missed the part where the OP pointed to a SO question. In that
> >     > question were links to explain.depesz.com output.
> >
> >     Ah, I didn't bother to chase that link either.
> >
> >     So the cost estimates are only a fraction of a percent apart, making
> >     it unsurprising for not-so-large changes in the index size to cause
> >     a flip in the apparently-cheapest plan.
> >     The real question then is why the cost estimates aren't actually
> >     modeling the real execution times very well; and I'd venture that
> >     that question boils down to why this rowcount estimate is so far off:
> >
> >     > ->  Parallel Seq Scan on oscar mike_three
> >     >     (cost=0.000..1934568.500 rows=2385585 width=3141)
> >     >     (actual time=159.800..158018.961 rows=23586 loops=3)
> >     >       Filter: (four AND (NOT bravo) AND (zulu <=
> >     >               'echo'::timestamp without time zone))
> >     >       Rows Removed by Filter: 8610174
> >
> >     We're not going to be able to answer that if the OP doesn't wish
> >     to decloak his data a bit more ... but a reasonable guess is that
> >     those filter conditions are correlated. With late-model Postgres
> >     you might be able to improve matters by creating extended statistics
> >     for this table.
> >
> >     regards, tom lane
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
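
P.S. If I'm reading the docs right, the dependency data for an extended
statistics object is only computed the next time ANALYZE runs on the table,
which would explain the <NULL> stxdependencies shown above if the statistics
object was created after the last nightly run. A minimal end-to-end sketch,
using the table and statistics names from this thread:

```sql
-- Create the functional-dependencies statistics object (as in the earlier message):
CREATE STATISTICS s1 (dependencies) ON shouldbebackedup, backupperformed
    FROM item;

-- The dependency data is only gathered by a subsequent ANALYZE;
-- until then, pg_statistic_ext shows stxdependencies as NULL:
ANALYZE item;

-- Check that the dependencies were actually computed:
SELECT stxname, stxkeys, stxdependencies
FROM pg_statistic_ext
WHERE stxname = 's1';
```

After the ANALYZE, stxdependencies should contain a degree-of-dependency
entry for the column pair (attnums 29 and 35 here), which the planner can
then use to correct the row estimate for the combined filter.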