Manfred Koizar <[EMAIL PROTECTED]> writes:
> On Mon, 31 Jan 2005 14:40:08 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
>> Oh, I see, you want a "max" calculation in there too. Seems reasonable.
>> Any objections?
> Yes. :-( What I said is only true in the absence of any WHERE clause
> (or join).
On Mon, 31 Jan 2005 14:40:08 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
> Manfred Koizar <[EMAIL PROTECTED]> writes:
>> That's not what I meant. I tried to say that if we have a GROUP BY
>> several columns and one of these columns alone has more than N/10
>> distinct values, there's no way to get less than that many groups.
Manfred Koizar <[EMAIL PROTECTED]> writes:
> That's not what I meant. I tried to say that if we have a GROUP BY
> several columns and one of these columns alone has more than N/10
> distinct values, there's no way to get less than that many groups.
Oh, I see, you want a "max" calculation in there too. Seems reasonable.
Any objections?
On Mon, 31 Jan 2005 11:20:31 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
> Already done that way.
>     if (relvarcount > 1)
>         clamp *= 0.1;
That's not what I meant. I tried to say that if we have a GROUP BY
several columns and one of these columns alone has more than N/10
distinct values, there's no way to get less than that many groups.
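The snippet quoted above is from estimate_num_groups() in
src/backend/utils/adt/selfuncs.c. As a rough sketch only (illustrative
variable names, not the actual PostgreSQL code), the "max" adjustment
Manfred is asking for amounts to never letting the clamp fall below the
largest single-column ndistinct:

    /*
     * Illustrative sketch only -- not the actual selfuncs.c code.
     * Combine per-column distinct-value estimates into a group-count
     * estimate, apply the proposed reltuples/10 clamp for multi-column
     * GROUP BY, but never clamp below the largest single-column
     * ndistinct (the "max" calculation under discussion).
     */
    static double
    sketch_estimate_num_groups(const double *ndistinct, int ncols,
                               double reltuples)
    {
        double  product = 1.0;
        double  maxsingle = 1.0;
        double  clamp = reltuples;
        int     i;

        for (i = 0; i < ncols; i++)
        {
            product *= ndistinct[i];
            if (ndistinct[i] > maxsingle)
                maxsingle = ndistinct[i];
        }

        if (ncols > 1)
            clamp *= 0.1;          /* the proposed "size of table / 10" */
        if (clamp < maxsingle)
            clamp = maxsingle;     /* can't have fewer groups than this */

        return (product < clamp) ? product : clamp;
    }

As Manfred notes in his later reply at the top of this listing, even that
lower bound only holds in the absence of any WHERE clause or join.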
Manfred Koizar <[EMAIL PROTECTED]> writes:
> On Fri, 28 Jan 2005 10:53:33 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
>> we should consider
>> something like "clamp to size of table / 10" instead.
> ... unless a *single* grouping column is estimated to have more than
> N/10 distinct values, which should be easy to check.
On Fri, 28 Jan 2005 10:53:33 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
> we should consider
> something like "clamp to size of table / 10" instead.
... unless a *single* grouping column is estimated to have more than
N/10 distinct values, which should be easy to check.
Servus
Manfred
--
> From: Sailesh Krishnamurthy <[EMAIL PROTECTED]>
> > "Tom" == Tom Lane <[EMAIL PROTECTED]> writes:
>
> Tom> The only real solution, of course, is to acquire cross-column
> Tom> statistics, but I don't see that happening in the near
> Tom> future.
>
> Another approach is a hybrid hashing scheme where we use a hash table
> until we run out of memory
Tom Lane <[EMAIL PROTECTED]> writes:
> Greg Stark's thought about a power correction seemed interesting too, though
> again far too optimistic to trust without some good math to back it up.
Fwiw, I'm pretty sure good math is not going to back up my off-the-cuff
algorithm. But I did like the answ
> "Tom" == Tom Lane <[EMAIL PROTECTED]> writes:
Tom> The only real solution, of course, is to acquire cross-column
Tom> statistics, but I don't see that happening in the near
Tom> future.
Another approach is a hybrid hashing scheme where we use a hash table
until we run out of memory
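Purely as an illustration of that idea (this is not PostgreSQL code, and
the thread does not settle on a design), a hybrid scheme aggregates in a
bounded in-memory hash table and writes rows belonging to groups that no
longer fit out to a spill file, to be aggregated in a later pass:

    /* Toy hybrid hash aggregation: count per key, spilling overflow keys. */
    #include <stdio.h>
    #include <stdint.h>

    #define MAX_GROUPS 4            /* stand-in for a work_mem budget */

    typedef struct { int64_t key; int64_t count; } Entry;

    static Entry groups[MAX_GROUPS];
    static int   ngroups = 0;

    /* Find or create the entry for key; NULL means "no room, spill it". */
    static Entry *lookup(int64_t key)
    {
        for (int i = 0; i < ngroups; i++)
            if (groups[i].key == key)
                return &groups[i];
        if (ngroups >= MAX_GROUPS)
            return NULL;
        groups[ngroups].key = key;
        groups[ngroups].count = 0;
        return &groups[ngroups++];
    }

    /* One pass: aggregate what fits in memory, spill the rest for later. */
    static void hybrid_pass(const int64_t *keys, size_t n, FILE *spill)
    {
        for (size_t i = 0; i < n; i++)
        {
            Entry *e = lookup(keys[i]);
            if (e != NULL)
                e->count++;                                  /* in-memory group */
            else
                fwrite(&keys[i], sizeof(int64_t), 1, spill); /* next pass */
        }
    }

    int main(void)
    {
        int64_t rows[] = {1, 2, 3, 4, 5, 1, 6, 2, 7, 3};
        FILE   *spill = tmpfile();

        hybrid_pass(rows, sizeof(rows) / sizeof(rows[0]), spill);

        for (int i = 0; i < ngroups; i++)
            printf("group %lld: count %lld\n",
                   (long long) groups[i].key, (long long) groups[i].count);
        printf("%ld bytes of spilled keys await a second pass\n", ftell(spill));

        fclose(spill);
        return 0;
    }

Kris Jurka's concern further down, that the hash plan has no provision at
all for spilling to disk, is exactly the gap a scheme like this would fill.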
Josh Berkus writes:
> Why 10? I'd think we could come up with a slightly less arbitrary number,
Well, it's probably within an order of magnitude of the right thing ;-).
We know we don't want 1, but 100 seems awfully optimistic.
If someone can come up with a more defensible number then I'm all ears.
Kris Jurka <[EMAIL PROTECTED]> writes:
> The proposed change biases towards a hash plan which has no provision for
> spilling to disk. Slow is one thing, but excessive memory usage and
> possibly failing is another thing.
Keep in mind that we are replacing 7.4 code that had a serious tendency
to
On Fri, 28 Jan 2005, Tom Lane wrote:
> you don't GROUP BY unique combinations of columns over huge
> tables --- or at least, you shouldn't expect great performance if you do.
The proposed change biases towards a hash plan which has no provision for
spilling to disk. Slow is one thing, but excessive memory usage and
possibly failing is another thing.
Tom,
> The only real solution, of course, is to acquire cross-column
> statistics, but I don't see that happening in the near future.
Y'know, that's been on the todo list for a while. Surely someone is inspired
for 8.1/8.2? At least for columns which are indexed together?
> As a short-term
Greg Stark <[EMAIL PROTECTED]> writes:
> So why is it any more reasonable for Postgres to assume 0 correlation than any
> other value. Perhaps Postgres should calculate these cases assuming some
> arbitrary level of correlation.
[ shrug... ] Sure, if you want to do the legwork to develop somethin
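Nothing concrete came of this in the thread, but to make the suggestion
tangible: one hypothetical way to "assume some arbitrary level of
correlation" is to interpolate between the independent estimate (the
product of per-column ndistinct) and the fully correlated one (the largest
single-column ndistinct), loosely in the spirit of the "power correction"
Greg mentions elsewhere in the thread. The formula and the fixed
correlation parameter below are illustrative assumptions, not anything
actually proposed here:

    #include <math.h>

    /*
     * Hypothetical illustration only: blend the uncorrelated group
     * estimate (product of per-column ndistinct) with the fully
     * correlated one (max single-column ndistinct) using a fixed
     * "assumed correlation" factor.
     */
    static double
    correlated_group_estimate(const double *ndistinct, int ncols,
                              double corr)  /* 0 = independent, 1 = fully correlated */
    {
        double product = 1.0;
        double maxsingle = 1.0;

        for (int i = 0; i < ncols; i++)
        {
            product *= ndistinct[i];
            if (ndistinct[i] > maxsingle)
                maxsingle = ndistinct[i];
        }

        /* geometric interpolation between the two extremes */
        return pow(maxsingle, corr) * pow(product, 1.0 - corr);
    }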
Tom Lane <[EMAIL PROTECTED]> writes:
> The reason this happens even with stats is that the algorithm for
> estimating the number of groups in a multi-group-column situation
> is basically "take the product of the number of distinct values of
> each grouping column, but clamp to the number of rows in the table".
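To put made-up numbers on that rule: with two grouping columns of 1,000
distinct values each over a 100,000-row table, the estimate is
min(1000 × 1000, 100000) = 100,000 groups, even if the columns are
perfectly correlated and the true group count is only 1,000. An estimate
that far too high is what pushes the planner toward the Sort/GroupAgg plan
described in the original report at the bottom of this listing, since a
hash table for 100,000 groups looks too big to fit in work_mem.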
* Tom Lane ([EMAIL PROTECTED]) wrote:
> The only real solution, of course, is to acquire cross-column
> statistics, but I don't see that happening in the near future.
That'd be nice, but sounds like a lot of work.
> As a short-term hack, I am thinking that the "clamp to size of table"
> part of th
I got a complaint from a fellow Red Hatter that PG 8.0 is way slower
than 7.4 on some statistical analysis tasks he was doing. Investigation
showed that the primary problem was selection of Sort/GroupAgg in place
of HashAgg to compute some grouped aggregates. Essentially he was doing
sel