On 21.08.2008 at 16:39, Scott Carey wrote:

It looks to me like the work_mem did have an effect.

Your earlier queries had a sort followed by a group aggregate at the top, and
now it's a hash aggregate, so the query plan DID change. That is likely where
the first 10x performance gain came from.

The top of the plan was:
GroupAggregate (cost…

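One way to confirm that it was the work_mem change that flipped the plan from
Sort + GroupAggregate to HashAggregate is to toggle the setting in a single
session and compare the EXPLAIN output. A minimal sketch, reusing the
sum(1.0/cat_count) query quoted later in this thread; the 1MB and 512MB
values are purely illustrative:

-- Sketch only: toggle work_mem and compare the top node of each plan.
SET work_mem = '1MB';                     -- small value, close to the default
EXPLAIN
select a."user", b.category, sum(1.0/b.cat_count)::float
from result a, domain_categories b
where a."domain" = b."domain"
group by a."user", b.category;            -- expect Sort + GroupAggregate on top

SET work_mem = '512MB';                   -- illustrative; enough room for the hash table
EXPLAIN
select a."user", b.category, sum(1.0/b.cat_count)::float
from result a, domain_categories b
where a."domain" = b."domain"
group by a."user", b.category;            -- expect HashAggregate on top

RESET work_mem;
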
On 20.08.2008 at 20:28, Tom Lane wrote:

"Scott Carey" <[EMAIL PROTECTED]> writes:
> The planner actually thinks there will only be 28704 rows returned of width
> 12. But it chooses to sort 53 million rows before aggregating. That's
> either a bug or there's something else wrong here. That is the wrong way
> to aggregate those results …

On 20.08.2008 at 20:06, Scott Carey wrote:

Ok, so the problem boils down to the sort at the end.

The query up through the merge join on domain is as fast as it's going to
get. The sort at the end, however, should ideally not happen. There are not
that many rows returned, and it should hash-aggregate if it thinks there is
enough space to do …

More work_mem will make the sort fit more in memory and less on disk, even
with the same query plan.
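
Even when a larger work_mem does not change the plan, EXPLAIN ANALYZE shows
whether the sort ran in memory or spilled to disk. A self-contained sketch
with a made-up table (sort_demo and all values are illustrative, not from the
thread):

-- Demo table, only to show the "Sort Method" line changing with work_mem.
CREATE TEMP TABLE sort_demo AS
  SELECT (random() * 100000)::int AS k FROM generate_series(1, 1000000);

SET work_mem = '64kB';                                -- the minimum allowed value
EXPLAIN ANALYZE SELECT k FROM sort_demo ORDER BY k;  -- Sort Method: external merge  Disk: ...

SET work_mem = '256MB';                               -- illustrative larger value
EXPLAIN ANALYZE SELECT k FROM sort_demo ORDER BY k;  -- Sort Method: quicksort  Memory: ...

RESET work_mem;
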
On 19.08.2008 at 16:49, Scott Carey wrote:

What is your work_mem set to? The default?

Try increasing it significantly if you have the RAM, and see whether that
affects the explain plan. You may even want to set it to a number larger than
the RAM you have, just to see what happens. In all honesty, it may be faster
to overflow to OS swap space …

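A quick way to check the current value and raise it for the current session
only; the 512MB figure is just an example, and on 8.3 the shipped default is
1MB:

SHOW work_mem;              -- what this session is currently using
SET work_mem = '512MB';     -- affects this session only; pick a value that fits your RAM
-- re-run EXPLAIN / EXPLAIN ANALYZE on the slow query here and compare plans
RESET work_mem;
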
On 19.08.2008 at 14:17, Matthew Wakeling wrote:

On Tue, 19 Aug 2008, Moritz Onken wrote:
>  tablename | attname | n_distinct | correlation
>  result    | domain  |       1642 |           1

Well, the important thing is the correlation, which is 1, indicating that
Postgres knows that the table is clustered. So I have no …

As far as I can tell, it should. If it is clustered on an index on domain,
and then analysed, it should no longer have to sort on domain.

Could you post here the results of running:

select * from pg_stats where attname = 'domain';

 schemaname | tablename | attname | null_frac | …

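A slightly narrower version of that check, limited to the two tables from
this thread and to the columns that matter here (n_distinct and correlation):

-- correlation close to 1 means the heap is physically ordered by "domain",
-- which is what clustering on a domain index should produce.
SELECT schemaname, tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE attname = 'domain'
  AND tablename IN ('result', 'domain_categories');
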
On Tue, 19 Aug 2008, Moritz Onken wrote:
> explain select
>     a."user", b.category, sum(1.0/b.cat_count)::float
> from result a, domain_categories b
> where a."domain" = b."domain"
> group by a."user", b.category;
>
> Both result and domain_categories are clustered on domain and analyzed.
> Why is it still sorting …

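For reference, "clustered on domain and analyzed" corresponds to something
like the following; the index names are hypothetical stand-ins for whatever
the real indexes on domain are called:

-- CLUSTER rewrites the table in index order; ANALYZE refreshes the statistics
-- (including the correlation value discussed above). 8.3 syntax; older
-- releases use "CLUSTER indexname ON tablename" instead.
CLUSTER result USING result_domain_idx;
CLUSTER domain_categories USING domain_categories_domain_idx;
ANALYZE result;
ANALYZE domain_categories;
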
On 18.08.2008 at 18:05, Matthew Wakeling wrote:

On Mon, 18 Aug 2008, Moritz Onken wrote:
> Running the query for more than one user is indeed not much slower. That's
> what I need. I'm clustering the result table on domain right now. But why
> is this better than clustering it on "user"?

The reason is the way that the merge join algorithm works. …

Well, you're getting the database to read the entire contents of the
domain_categories table in order. That's 12 million rows - a fair amount of
work.

You may find that removing the "user = 1337" constraint doesn't make the
query much slower - that's where you get a big win by clustering …

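A sketch of the comparison suggested above: run the aggregate with and
without the single-user filter and compare the timings. The query and the
1337 value come from this thread; everything else is illustrative:

-- With the single-user filter:
EXPLAIN ANALYZE
select a."user", b.category, sum(1.0/b.cat_count)::float
from result a, domain_categories b
where a."domain" = b."domain" and a."user" = 1337
group by a."user", b.category;

-- Without it: if reading domain_categories dominates, the runtimes should be
-- in the same ballpark.
EXPLAIN ANALYZE
select a."user", b.category, sum(1.0/b.cat_count)::float
from result a, domain_categories b
where a."domain" = b."domain"
group by a."user", b.category;
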
On Mon, 18 Aug 2008, Moritz Onken wrote:
> "HashAggregate  (cost=817397.78..817428.92 rows=2491 width=8)
>   (actual time=42874.339..42878.419 rows=3361 loops=1)"
> "  ->  Merge Join  (cost=748.47..674365.50 rows=19070970 width=8)
>       (actual time=15702.449..42829.388 rows=36308 loops=1)"
> "        Merge …

On 18.08.2008 at 16:30, Matthew Wakeling wrote:

On Mon, 18 Aug 2008, Moritz Onken wrote:
> I have indexes on result.domain, domain_categories.domain, result.user,
> and domain_categories.category. Clustered result on user and
> domain_categories on domain.
>
> "->  Materialize  (cost=2118752.28..2270064.64 rows=12104989 width=8)
>   (actual time=464…

Moritz Onken wrote:

Hi,

I run this query:

select max(a."user"), b.category, count(1)
from result a, domain_categories b
where a."domain" = b."domain"
group by b.category;

The table result contains all websites a user visited, and the table
domain_categories contains all categories a domain is in.

result has 2…

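For readers following along, here is a minimal reconstruction of the two
tables as they are described in this thread. Only the column names appear in
the original messages; the types and index names are assumptions:

-- Hypothetical DDL inferred from the queries above; all types are guesses.
CREATE TABLE result (
    "user"  integer,          -- the visiting user
    domain  integer           -- the visited domain; joins to domain_categories
);
CREATE TABLE domain_categories (
    domain    integer,
    category  integer,        -- one of possibly several categories per domain
    cat_count integer         -- number of categories for the domain (used as 1.0/cat_count)
);
-- The four indexes mentioned above in the thread (names are made up):
CREATE INDEX result_domain_idx            ON result (domain);
CREATE INDEX result_user_idx              ON result ("user");
CREATE INDEX domain_categories_domain_idx ON domain_categories (domain);
CREATE INDEX domain_categories_cat_idx    ON domain_categories (category);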