Re: Multidimensional Histograms

2024-01-10 Thread Andrei Lepikhov
On 8/1/2024 16:21, Alexander Cheshev wrote: Hi Andrei, Maybe my wording needed to be more precise. I didn't implement multidimensional histograms before, so I don't know how expensive they are. I meant that for dependency statistics over about six columns, we have a lot of combinations to compu

Re: Multidimensional Histograms

2024-01-08 Thread Alexander Cheshev
Hi Andrei, > Maybe my wording needed to be more precise. I didn't implement > multidimensional histograms before, so I don't know how expensive they > are. I meant that for dependency statistics over about six columns, we > have a lot of combinations to compute. Equi-Depth Histogram in a 6 dimens

Re: Multidimensional Histograms

2024-01-07 Thread Andrei Lepikhov
On 8/1/2024 01:36, Tomas Vondra wrote: On 1/7/24 18:26, Andrei Lepikhov wrote: On 7/1/2024 17:51, Tomas Vondra wrote: On 1/7/24 11:22, Andrei Lepikhov wrote: On 7/1/2024 06:54, Tomas Vondra wrote: It's an interesting are for experiments, no doubt about it. And if you choose to explore it, tha

Re: Multidimensional Histograms

2024-01-07 Thread Alexander Cheshev
Hi Tomas, > See section 3.2 in this paper (the "view PDF" does not work for me, but > the "source PDF" does download a postscript): I believe that you are referring to a dynamic programming approach. It is a 1-dimensional case! To find an optimal solution in the M-dimensional case is an NP-hard p

Re: Multidimensional Histograms

2024-01-07 Thread Tomas Vondra
On 1/7/24 23:53, Alexander Cheshev wrote: > Hi Tomas, > >> The thing I was proposing is that it should be possible to build >> histograms with bins adapted to density in the given region. With >> smaller buckets in areas with high density. So that queries intersect >> with fewer buckets in low-d

Re: Multidimensional Histograms

2024-01-07 Thread Alexander Cheshev
Hi Tomas, > The thing I was proposing is that it should be possible to build > histograms with bins adapted to density in the given region. With > smaller buckets in areas with high density. So that queries intersect > with fewer buckets in low-density parts of the histogram. This is how Equi-Dep

Re: Multidimensional Histograms

2024-01-07 Thread Tomas Vondra
On 1/7/24 18:26, Andrei Lepikhov wrote: > On 7/1/2024 17:51, Tomas Vondra wrote: >> On 1/7/24 11:22, Andrei Lepikhov wrote: >>> On 7/1/2024 06:54, Tomas Vondra wrote: It's an interesting are for experiments, no doubt about it. And if you choose to explore it, that's fine. But it's bett

Re: Multidimensional Histograms

2024-01-07 Thread Andrei Lepikhov
On 7/1/2024 17:51, Tomas Vondra wrote: On 1/7/24 11:22, Andrei Lepikhov wrote: On 7/1/2024 06:54, Tomas Vondra wrote: It's an interesting are for experiments, no doubt about it. And if you choose to explore it, that's fine. But it's better to be aware it may not end with a commit. For the multi

Re: Multidimensional Histograms

2024-01-07 Thread Tomas Vondra
On 1/7/24 11:22, Andrei Lepikhov wrote: > On 7/1/2024 06:54, Tomas Vondra wrote: >> It's an interesting are for experiments, no doubt about it. And if you >> choose to explore it, that's fine. But it's better to be aware it may >> not end with a commit. >> For the multi-dimensional case, I propo

Re: Multidimensional Histograms

2024-01-07 Thread Andrei Lepikhov
On 7/1/2024 06:54, Tomas Vondra wrote: It's an interesting are for experiments, no doubt about it. And if you choose to explore it, that's fine. But it's better to be aware it may not end with a commit. For the multi-dimensional case, I propose we first try to experiment with the various algorith

Re: Multidimensional Histograms

2024-01-06 Thread Tomas Vondra
On 1/6/24 01:00, Alexander Cheshev wrote: > Hi Tomas, > >> Another reason was that the algorithm described in the two papers you >> reference (1988 paper by DeWitt and the evaluation by Carlson and >> Sutherland from ~2010) is simple but has pretty obvious weaknesses. It >> processes the column

Re: Multidimensional Histograms

2024-01-06 Thread Alexander Cheshev
Hi Tomas, I am sorry I didn't look into the code carefully. Indeed Postgres uses Equi-Depth Histogram: delta = (nvals - 1) / (num_hist - 1); Regards, Alexander Cheshev On Sat, 6 Jan 2024 at 01:00, Alexander Cheshev wrote: > > Hi Tomas, > > > Another reason was that the algorithm described in

Re: Multidimensional Histograms

2024-01-05 Thread Alexander Cheshev
Hi Tomas, > Another reason was that the algorithm described in the two papers you > reference (1988 paper by DeWitt and the evaluation by Carlson and > Sutherland from ~2010) is simple but has pretty obvious weaknesses. It > processes the columns one by one - first build bucket on column "a", > th

Re: Multidimensional Histograms

2023-12-28 Thread Tomas Vondra
On 12/27/23 22:19, Tomas Vondra wrote: > Hello Alexander, > > We actually had histograms in the early patches adding multivariate > statistics [1]. We however ended up removing histograms and only kept > the simpler types, for a couple reasons. > > It might be worth going through the discussion -

Re: Multidimensional Histograms

2023-12-27 Thread Tomas Vondra
Hello Alexander, We actually had histograms in the early patches adding multivariate statistics [1]. We however ended up removing histograms and only kept the simpler types, for a couple reasons. It might be worth going through the discussion - I'm sure one of the reasons was we needed to limit t