[SQL] Aggregate query for multiple records

2004-08-25 Thread Scott Gerhardt
Hello, I am new to the list, my apology if this question is beyond the 
scope or charter of this list.

My questions is:
What is the best method to perform an aggregate query to calculate 
sum() values for each distinct wid as in the example below, but except 
for all wid's (not just WHERE wid='01/1-6-1-30w1/0').

Also, performance wise, would it be better to build a function for this 
query.  The table has 9 million records and these aggregate queries 
take hours.

SELECT
  SUM(oil) as sumoil, SUM(hours) as sumhours,
FROM
  (SELECT oil, hours prd_data WHERE wid='01/1-6-1-30w1/0'
   ORDER BY date LIMIT 6) subtable
;
Table description:
  Table "prd_data"
 Column | Type  | Modifiers
+---+---
 date   | integer   |
 hours  | real  |
 oil| real  |
 gas| real  |
 water  | real  |
 pwid   | integer   |
 wid| character varying(20) |
 year   | smallint  |
Indexes: wid_index6
Actual table (prd_data), 9 million records:
  date  | hours |  oil  | gas  | water | pwid |   wid   | year
+---+---+--+---+--+-+--
 196507 |   360 | 159.4 | 11.3 |  40.9 |  413 | 01/1-1-1-31w1/0 | 1965
 196508 |   744 |   280 |   20 |  27.2 |  413 | 01/1-1-1-31w1/0 | 1965
 196509 |   360 | 171.1 | 11.4 |  50.4 |  413 | 01/1-1-1-31w1/0 | 1965
 196510 |   744 | 202.1 |   25 |  89.8 |  413 | 01/1-1-1-31w1/0 | 1965
 196512 |   744 | 201.3 | 23.8 |  71.9 |  413 | 01/1-1-1-31w1/0 | 1965
 196511 |   720 |   184 | 17.6 |  78.9 |  413 | 01/1-1-1-31w1/0 | 1965
 196610 |   744 |  99.8 | 15.4 |  53.7 |  413 | 01/1-1-1-31w1/0 | 1966
 196612 |   744 |86 | 12.8 |  36.1 |  413 | 01/1-1-1-31w1/0 | 1966
 196611 |   720 |86 | 12.6 |  41.7 |  413 | 01/1-1-1-31w1/0 | 1966
 196601 |   744 | 191.6 | 22.6 |  50.7 |  413 | 01/1-1-1-31w1/0 | 1966
 200301 |   461 |  68.8 |   0 | 186.3 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200310 |   740 | 446.3 |   0 | 563.1 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200306 |   667 |  92.1 |   0 | 968.8 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200304 | 0 | 0 |   0 | 0 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200308 |   457 | 100.7 |   0 |  82.8 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200307 |   574 |78 |   0 |   752 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200312 |   582 | 360.9 |   0 |   569 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200311 |   681 | 260.8 |   0 | 563.9 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200305 |   452 | 0 |   0 | 0 | 47899 | 9G/6-1-50-24w3/0 | 2003
 200309 |   637 | 244.6 |   0 | 193.8 | 47899 | 9G/6-1-50-24w3/0 | 2003
(20 rows)

Thanks,
--
Scott A. Gerhardt, P.Geo.
Gerhardt Information Technologies
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] Aggregate query for multiple records

2004-08-28 Thread Scott Gerhardt

Greg Stark <[EMAIL PROTECTED]> writes:
[ nice example snipped ]
... Also, you'll have to change it to use reals.
That part, at least, can be worked around as of 7.4: use polymorphic
functions.  You can declare the functions and aggregate as working on
anyelement/anyarray, and then they will automatically work on any
datatype that has a + operator.
regression=# create or replace function first_6_accum  
(anyarray,anyelement) returns anyarray
regression-# language sql immutable as 'select case when  
array_upper($1,1)>=6 then $1 else $1||$2 end';
CREATE FUNCTION
regression=# create function sum_6(anyarray) returns anyelement  
immutable language sql as 'select  
$1[1]+$1[2]+$1[3]+$1[4]+$1[5]+$1[6]';
CREATE FUNCTION
regression=# create aggregate sum_first_6 (basetype=anyelement,  
sfunc=first_6_accum, stype=anyarray,initcond='{}',finalfunc=sum_6);
CREATE AGGREGATE
regression=# select sum_first_6(i) from (select i from (select 1 as i  
union select 2 union select 3 union select 4 union select 5 union  
select 6 union select 7 union select 8) as x order by i desc) as x;
 sum_first_6
-
  33
(1 row)

regression=# select sum_first_6(i) from (select i from (select 1.1 as  
i union select 2 union select 3 union select 4 union select 5 union  
select 6 union select 7.7 union select 8) as x order by i desc) as x;
 sum_first_6
-
33.7
(1 row)

regression=#
regards, tom lane

An alternate solution I'm thinking is to add column to hold a  
"total_months" value that could be used to simplify queries and speed  
queries ( i.e. first month of oil productin = 1, second = 2 etc.)  That  
way I can use select the first 6 months by using "where < 6", or any  
month interval for that matter.

The following query, suggested by another list member (thanks Josh  
Berkus), to populate the "total_months" column sort of work but doesn't  
handle the year wrapping as it adds 88 when the year wraps (see output  
below).

UPDATE prd_data_test SET months_prod = prd_data_test."date" -  
prd2."date" + 1
FROM prd_data_test prd2
WHERE prd_data_test.wid = prd2.wid
AND prd2."date" = ( SELECT "date" FROM prd_data_test prd3
	WHERE prd3.wid = prd2.wid
	ORDER BY "date" LIMIT 1 );

The results are:
SEM=# select * from prd_data_test order by wid, date limit 20;
  date  | hours |  oil  | gas  | water | pwid |   wid   | year  
| month_prd | months_prod
+---+---+--+---+--+-+-- 
+---+-
 196505 |   480 | 194.3 | 10.3 | 0 |1 | 01/1-6-1-30w1/0 | 1965  
|   |   1
 196506 |   600 | 279.4 | 13.1 | 0 |1 | 01/1-6-1-30w1/0 | 1965  
|   |   2
 196507 |   744 | 288.1 |  4.5 | 0 |1 | 01/1-6-1-30w1/0 | 1965  
|   |   3
 196508 |   720 | 234.6 |  9.4 |   2.9 |1 | 01/1-6-1-30w1/0 | 1965  
|   |   4
 196509 |   648 | 208.2 | 12.5 | 6 |1 | 01/1-6-1-30w1/0 | 1965  
|   |   5
 196510 |   744 | 209.8 | 15.3 | 0 |1 | 01/1-6-1-30w1/0 | 1965  
|   |   6
 196511 |   720 | 180.5 | 13.9 |  27.7 |1 | 01/1-6-1-30w1/0 | 1965  
|   |   7
 196512 |   744 | 227.4 | 22.8 |   5.2 |1 | 01/1-6-1-30w1/0 | 1965  
|   |   8
 196601 |   744 | 230.3 | 22.7 |10 |1 | 01/1-6-1-30w1/0 | 1966  
|   |  97
 196602 |   672 | 173.2 | 16.5 |17 |1 | 01/1-6-1-30w1/0 | 1966  
|   |  98
 196603 |   744 | 197.2 | 18.7 |   9.2 |1 | 01/1-6-1-30w1/0 | 1966  
|   |  99
 196604 |   720 | 168.1 | 14.1 | 3 |1 | 01/1-6-1-30w1/0 | 1966  
|   | 100

Table description:
  Table "prd_data"
 Column | Type  | Modifiers
+---+---
 date   | integer   |
 hours  | real  |
 oil| real  |
 gas| real  |
 water  | real  |
 pwid   | integer   |
 wid| character varying(20) |
 year   | smallint  |
Indexes: wid_index6
Actual table (prd_data), 9 million records:
  date  | hours |  oil  | gas  | water | pwid |   wid   | year
+---+---+--+---+--+-+--
 196507 |   360 | 159.4 | 11.3 |  40.9 |  413 | 01/1-1-1-31w1/0 | 1965
 196508 |   744 |   280 |   20 |  27.2 |  413 | 01/1-1-1-31w1/0 | 1965
 196509 |   360 | 171.1 | 11.4 |  50.4 |  413 | 01/1-1-1-31w1/0 | 1965
 196510 |   744 | 202.1 |   25 |  89.8 |  413 | 01/1-1-1-31w1/0 | 1965
 196512 |   744 | 201.3 | 23.8 |  71.9 |  413 | 01/1-1-1-31w1/0 | 1965
 196511 |   720 |   184 | 17.6 |  78.9 |  413 | 01/1-1-1-31w1/0 | 1965
 196610 |   744 |  99.8 | 15.4 |  53.7 |  413 | 01/1-1-1-31w1/0 | 1966
 196612 |   744 |86 | 12.8 |  36.1 |  413 | 01/1-1-1-31w1/0 | 1966
 196611 |   720 |86 | 12.6 |  41.7 |  413 | 01/1-1-1-31w1/0 | 1966
 196601 |   744 | 191.6 | 22.6 |  50.7 |