[PERFORM] column totals
Hi there,

I've got a situation where I need to pull profit information by product
category, as well as the totals for each branch.  Basically, something like:

SELECT branch_id, prod_cat_id, sum(prod_profit) as prod_cat_profit
FROM () as b1
WHERE x = y
GROUP BY branch_id, prod_cat_id

Now, I also need the branch total; effectively:

SELECT branch_id, sum(prod_profit) as branch_total
FROM () as b1
WHERE x = y
GROUP BY branch_id

Since the actual queries for generating prod_profit are non-trivial, how do
I combine them to get the following select list?

SELECT branch_id, prod_cat_id, sum(prod_profit) as prod_cat_profit,
       sum(prod_profit) as branch_total

Or is there a more efficient way?

Kind regards,
James Neethling
Silver Sphere Business Solutions
Re: [PERFORM] is it possible to make this faster?
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> On 5/25/06, Tom Lane <[EMAIL PROTECTED]> wrote:
>> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
>>> recent versions of mysql do much better, returning same set in < 20ms.

>> Are you sure you measured that right?  I tried to duplicate this using
>> mysql 5.0.21, and I see runtimes of 0.45 sec without an index and
>> 0.15 sec with.  This compares to psql times around 0.175 sec.  Doesn't
>> look to me like we're hurting all that badly, even without using the
>> index.

> Well, my numbers were approximate, but I tested on a few different
> machines.  The times got closer as the cpu speed got faster.  pg really
> loves a quick cpu.  On a 600 MHz P3 I got 70ms on mysql and 1050ms on
> pg.  The mysql query cache is always off for my performance testing.

Well, this bears looking into, because I couldn't get anywhere near 20ms
with mysql.  I was using a dual Xeon 2.8GHz machine, which ought to be
quick enough, and the stock Fedora Core 5 RPM of mysql.  (Well, actually
that SRPM built on FC4, because this machine is still on FC4.)  I made a
MyISAM table with three integer columns as mentioned, and filled it with
about 300000 rows with 2000 distinct values of (a,b) and random values of
c.  I checked the timing both in the mysql CLI and with a trivial test
program that timed mysql_real_query() plus mysql_store_result(), getting
pretty near the same timings each way.

BTW, in pgsql it helps a whole lot to raise work_mem a bit for this
example --- at the default work_mem it wants to do a sort + GroupAggregate
plan, while with work_mem of 2000 or more it'll use a HashAggregate plan,
which is quite a bit faster.  It seems possible that there is some
equivalently simple tuning on the mysql side that you did and I didn't.
This is an utterly stock mysql install, just "rpm -i" and
"service mysqld start".

			regards, tom lane
Re: [PERFORM] column totals
On Fri, 2006-05-26 at 11:56 +0200, James Neethling wrote:
> SELECT branch_id, prod_cat_id, sum(prod_profit) as prod_cat_profit
> FROM () as b1
> WHERE x = y
> GROUP BY branch_id, prod_cat_id
>
> Now, I also need the branch total, effectively,
> SELECT branch_id, sum(prod_profit) as branch_total
> FROM () as b1
> WHERE x = y
> GROUP BY branch_id.
>
> Since the actual queries for generating prod_profit are non-trivial,
> how do I combine them to get the following select list?

One simple way, using a temp table and two steps:

CREATE TEMP TABLE foo AS
  SELECT branch_id, prod_cat_id, sum(prod_profit) as prod_cat_profit
  FROM () as b1
  WHERE x = y
  GROUP BY branch_id, prod_cat_id;

SELECT branch_id, prod_cat_id, prod_cat_profit, branch_total
FROM foo as foo1
JOIN (SELECT branch_id, sum(prod_cat_profit) as branch_total
      FROM foo
      GROUP BY branch_id
     ) as foo2 USING (branch_id);

(untested)

gnari
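The same numbers can also be had in a single statement by joining the
per-category aggregate to a per-branch aggregate of the same subquery
(equally untested; profit_detail is just a stand-in for the non-trivial
profit query, which this way gets evaluated twice):

SELECT d.branch_id, d.prod_cat_id, d.prod_cat_profit, b.branch_total
FROM (SELECT branch_id, prod_cat_id, sum(prod_profit) as prod_cat_profit
      FROM profit_detail           -- stand-in for the real profit query
      WHERE x = y
      GROUP BY branch_id, prod_cat_id
     ) as d
JOIN (SELECT branch_id, sum(prod_profit) as branch_total
      FROM profit_detail           -- same stand-in, evaluated a second time
      WHERE x = y
      GROUP BY branch_id
     ) as b USING (branch_id);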
Re: [PERFORM] is it possible to make this faster?
On 5/26/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> Well, this bears looking into, because I couldn't get anywhere near 20ms
> with mysql.  I was using a dual Xeon 2.8GHz machine which ought to be

Did you have a key on a,b,c?  If I include an unimportant unkeyed field d,
the query time goes from 70ms to ~1 second.  The mysql planner is tricky;
it's full of special case optimizations...

select count(*) from (select a,b,max(c) group by a,b) q;

blows the high performance case, as does putting the query in a view.

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.0.16    |
+-----------+
1 row in set (0.00 sec)

mysql> set global query_cache_size = 0;
Query OK, 0 rows affected (0.00 sec)

mysql> select user_id, acc_id, max(sample_date) from usage_samples group by 1,2
[...]
939 rows in set (0.07 sec)

mysql> select user_id, acc_id, max(sample_date) from usage_samples group by 1,2
[...]
939 rows in set (1.39 sec)

merlin
Re: [PERFORM] is it possible to make this faster?
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> did you have a key on a,b,c?

Yeah, I did
	create index t1i on t1 (a,b,c);
Do I need to use some other syntax to get it to work?

> select count(*) from (select a,b,max(c) group by a,b) q;
> blows the high performance case as does putting the query in a view.

I noticed that too, while trying to suppress the returning of the results
for timing purposes ... still a few bugs in their optimizer, obviously.
(Curiously, EXPLAIN still claims that the index is being used.)

> mysql> select user_id, acc_id, max(sample_date) from usage_samples group by 1,2
> [...]
> 939 rows in set (0.07 sec)

> mysql> select user_id, acc_id, max(sample_date) from usage_samples group by 1,2
> [...]
> 939 rows in set (1.39 sec)

I don't understand what you did differently in those two cases?  Or was
there a DROP INDEX in between?

			regards, tom lane
Re: [PERFORM] is it possible to make this faster?
On 5/26/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
> > did you have a key on a,b,c?
>
> Yeah, I did
> 	create index t1i on t1 (a,b,c);
> Do I need to use some other syntax to get it to work?

Can't think of anything; I'm running completely stock.  Did you do an
OPTIMIZE TABLE foo?  Is the wind blowing in the right direction?

> > select count(*) from (select a,b,max(c) group by a,b) q;
> > blows the high performance case as does putting the query in a view.
>
> I noticed that too, while trying to suppress the returning of the
> results for timing purposes ... still a few bugs in their optimizer
> obviously.  (Curiously, EXPLAIN still claims that the index is being
> used.)

Well, they do some tricky things pg can't do for architectural reasons,
but the special case is obviously hard to get right.  I suppose this kinda
argues against doing all kinds of acrobatics to optimize mvcc weak cases
like the above and count(*)... better to make heap access as quick as
possible.

> > mysql> select user_id, acc_id, max(sample_date) from usage_samples group by 1,2
> > 939 rows in set (0.07 sec)
> > mysql> select user_id, acc_id, max(sample_date) from usage_samples group by 1,2
> > 939 rows in set (1.39 sec)

Oops, pasted the wrong query.  Case 2 should have been

select user_id, acc_id, max(sample_date), disksize from usage_samples group by 1,2

illustrating what going to the heap does to the time.

merlin
Re: [PERFORM] is it possible to make this faster?
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> can't think of anything, I'm running completely stock, did you do an
> optimize table foo?

Nope, never heard of that before.  But I did it, and it doesn't seem to
have changed my results at all.

> mysql> select user_id, acc_id, max(sample_date) from usage_samples group by 1,2
> 939 rows in set (0.07 sec)

0.07 seconds is not impossibly out of line with my result of 0.15 sec;
maybe your machine is just 2X faster than mine.  This is a 2.8GHz dual
Xeon EM64T, what are you testing?  You said "less than 20 msec" before,
what was that on?

			regards, tom lane
Re: [PERFORM] is it possible to make this faster?
On 5/26/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> > mysql> select user_id, acc_id, max(sample_date) from usage_samples group by 1,2
> > 939 rows in set (0.07 sec)
>
> 0.07 seconds is not impossibly out of line with my result of 0.15 sec,
> maybe your machine is just 2X faster than mine.  This is a 2.8GHz dual
> Xeon EM64T, what are you testing?  You said "less than 20 msec" before,
> what was that on?

600 MHz P3: 70 ms; 1100 ms slow case
1600 MHz P4: 10-30 ms (mysql timer not very precise); 710 ms slow case
quad Opteron 865: 0 :-)
dual 1133 MHz P3 Xeon, mysql 4.0.16: 500 ms

Using steinar's 'substitute group by' for pg I get 40 ms on the P3 and low
times on all else.  Your time of 150 ms is looking like the slow case on
my results.

merlin
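The 'substitute group by' rewrite itself isn't quoted in this excerpt; one
rewrite that produces the same per-(a,b) maximum in pg is DISTINCT ON,
shown here purely as an illustration (it may not be the exact variant
meant above):

select distinct on (a, b) a, b, c
from t1
order by a, b, c desc;
-- keeps one row per (a,b), the one with the largest c, i.e. the same
-- result set as select a, b, max(c) from t1 group by a, b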
Re: [PERFORM] is it possible to make this faster?
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> your time of 150 ms is looking like the slow case on my results.

Yeah ... so what's wrong with my test?  Anyone else care to duplicate the
test and see what they get?

			regards, tom lane