Re: BRIN index which is much faster never chosen by planner

2019-10-15 Thread David Rowley
On Wed, 16 Oct 2019 at 11:40, Justin Pryzby wrote: > It didn't occur to me at the time, but that would also allow > creating numerous, partial BRIN indices, each of which would have separate > correlation computed over just their "restricted range", which *might* also > handle your case, depending

Re: BRIN index which is much faster never chosen by planner

2019-10-15 Thread David Rowley
On Wed, 16 Oct 2019 at 05:05, Jeremy Finzel wrote: > But perhaps it would be worth exploring if there could be more detailed stats > on physical vs logical correlation, such as when ANALYZE takes its samples, > noting physical locations as well as logical values, and allowing the > correlation

Re: BRIN index which is much faster never chosen by planner

2019-10-15 Thread Justin Pryzby
This reminds me of an issue I reported several years ago where Btree index scans were chosen over seq scan of a large, INSERT-only table due to very high correlation, but performed poorly. I concluded that use of the the high "large scale" correlation on a large 50+GB table caused the planner to f

Re: BRIN index which is much faster never chosen by planner

2019-10-15 Thread Michael Lewis
Thanks for closing the loop on the data correlation question. I've been playing with BRIN indexes on a log table of sorts and this thread helped clear up some of the behavior I have been seeing. I am curious, would a partial btree index fit your needs? Perhaps the maintenance overhead is too signi

Re: BRIN index which is much faster never chosen by planner

2019-10-15 Thread Jeremy Finzel
Thank you for the thorough and thoughtful reply! Please see below. On Mon, Oct 14, 2019 at 3:48 PM David Rowley wrote: > Another thing which you might want to look at is the correlation > column in the pg_stats view for the rec_insert_time column. Previous > to 7e534adcd, BRIN index were costed

Re: BRIN index which is much faster never chosen by planner

2019-10-14 Thread David Rowley
On Tue, 15 Oct 2019 at 08:43, Jeremy Finzel wrote: > I wanted to follow up on this specific issue. Isn't this the heart of the > matter and a fundamental problem? If I want to rely on BRIN indexes as in a > straightforward case as explained in OP, but I don't know if the planner will > be nea

Re: BRIN index which is much faster never chosen by planner

2019-10-14 Thread Jeremy Finzel
> > The other issue is that the estimation of pages fetched using bitmap > heap scan is rather crude - but that's simply hard, and I don't think we > can fundamentally improve this. > I wanted to follow up on this specific issue. Isn't this the heart of the matter and a fundamental problem? If I

Re: BRIN index which is much faster never chosen by planner

2019-10-11 Thread Tomas Vondra
On Fri, Oct 11, 2019 at 09:08:05AM -0500, Jeremy Finzel wrote: On Thu, Oct 10, 2019 at 7:22 PM David Rowley wrote: The planner might be able to get a better estimate on the number of matching rows if the now() - interval '10 days' expression was replaced with 'now'::timestamptz - interval '10

Re: BRIN index which is much faster never chosen by planner

2019-10-11 Thread Jeremy Finzel
On Thu, Oct 10, 2019 at 6:13 PM Tomas Vondra wrote: > > So this seems like a combination of multiple issues. Firstly, the bitmap > index scan on rec_insert_time_brin_1000 estimate seems somewhat poor. It > might be worth increasing stats target on that column, or something like > that. Not sure,

Re: BRIN index which is much faster never chosen by planner

2019-10-11 Thread Jeremy Finzel
Dear Michael, On Thu, Oct 10, 2019 at 5:20 PM Michael Lewis wrote: > Since the optimizer is choosing a seq scan over index scan when it seems > like it has good row estimates in both cases, to me that may mean costs of > scanning index are expected to be high. Is this workload on SSD? Has the >

Re: BRIN index which is much faster never chosen by planner

2019-10-11 Thread Jeremy Finzel
On Thu, Oct 10, 2019 at 7:22 PM David Rowley wrote: > The planner might be able to get a better estimate on the number of > matching rows if the now() - interval '10 days' expression was > replaced with 'now'::timestamptz - interval '10 days'. However, care > would need to be taken to ensure the

Re: BRIN index which is much faster never chosen by planner

2019-10-11 Thread David Rowley
On Fri, 11 Oct 2019 at 17:48, Michael Lewis wrote: > > On Thu, Oct 10, 2019 at 6:22 PM David Rowley > wrote: >> The planner will just estimate the selectivity of now() - interval '10 >> days' by using DEFAULT_INEQ_SEL, which is 0., so it >> thinks it'll get 1/3rd of the table.

Re: BRIN index which is much faster never chosen by planner

2019-10-10 Thread Michael Lewis
On Thu, Oct 10, 2019 at 6:22 PM David Rowley wrote: > The planner might be able to get a better estimate on the number of > matching rows if the now() - interval '10 days' expression was > replaced with 'now'::timestamptz - interval '10 days'. However, care > would need to be taken to ensure the

Re: BRIN index which is much faster never chosen by planner

2019-10-10 Thread David Rowley
On Fri, 11 Oct 2019 at 12:13, Tomas Vondra wrote: > The index scan is estimated to return 157328135 rows, i.e. about 50% of > the table (apparently it's ~10x more than the actual number). Don't pay too much attention to the actual row counts from bitmap index scans of brin indexes. The value is e

Re: BRIN index which is much faster never chosen by planner

2019-10-10 Thread Tomas Vondra
On Thu, Oct 10, 2019 at 04:58:11PM -0500, Jeremy Finzel wrote: ... Notice it chooses the smallest BRIN index with 1000 pages per range, and this is far faster than the seq scan. I do believe the estimate is actually way off. Just a plain EXPLAIN of the latter estimates 10x more rows than actu

Re: BRIN index which is much faster never chosen by planner

2019-10-10 Thread Michael Lewis
Since the optimizer is choosing a seq scan over index scan when it seems like it has good row estimates in both cases, to me that may mean costs of scanning index are expected to be high. Is this workload on SSD? Has the random_page_cost config been decreased from default 4 (compared with cost of 1

BRIN index which is much faster never chosen by planner

2019-10-10 Thread Jeremy Finzel
Good Afternoon, I posted about this on another thread here , but the topic was not precisely planner issues, so I wanted to post it here. I am running Postgres 11.5. I have a table that is i