[SQL] distinct / group by assistance.

2008-06-28 Thread Gavin 'Beau' Baumanis

Hi Everyone,

I am having some issues trying to create the required SQL that will  
allow me to return the results I am after.
I have been trying various incarnations, using group by, sub-queries -  
albeit to no avail - for the past three hours.



Consider the following simple SQL;

select
a.foo,
b.bar,
c.something
from
table1 a,
inner join
table2 b on b.id =a.id
left outer join
table3 on c.id = a.id

If there a multiple rows of the same id in table1, I get all  
(multiple) rows - as you would expect - of course.


What I need however, is only one row returned per instance a.id that  
is returned by the above query.


I thought of using group by - but there are no calculated fields...  
and the real query contains 32 fields, which according to the errors I  
ran into while trying to get this working, would all need to be  
included in the group by clause.


So my understanding of group by is obviously a little dodgy - and  
obviously not quite what I was expecting.
I tried using a sub query and select distinct a.id. but that  
didn't exactly help either.


I have deliberately included in my example the fact there is an outer  
join too - I am not sue if that matters or not... but just in case  
thought it prudent to include it here.



Thanks in advance for any help you might have.

Beau.

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] distinct / group by assistance.

2008-06-28 Thread Tom Lane
"Gavin 'Beau' Baumanis" <[EMAIL PROTECTED]> writes:
> ... If there a multiple rows of the same id in table1, I get all  
> (multiple) rows - as you would expect - of course.

> What I need however, is only one row returned per instance a.id that  
> is returned by the above query.

You need GROUP BY a.id.

> I thought of using group by - but there are no calculated fields...  
> and the real query contains 32 fields, which according to the errors I  
> ran into while trying to get this working, would all need to be  
> included in the group by clause.

No, you wouldn't want to do that, because then you'd be back to multiple
rows per a.id value.

The problem here is that for any one a.id value there could be multiple
values of the other variables (coming from different rows) and so the
query results are not well defined if you just add "GROUP BY a.id".
What you need to do is decide which of those values you want and use
an aggregate function to get it.  So your query might end up looking
like
select a.id, min(a.foo), avg(b.bar), ... from ... group by a.id;

regards, tom lane

-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] distinct / group by assistance.

2008-06-28 Thread Gavin 'Beau' Baumanis

Hi Tom I am a nut.

please find below my correct requirements.

select
   a.foo,
   b.bar,
   c.something
from
table1 a,
inner join
table2 b on b.id =a.id
left outer join
table3 on c.id = a.id

If there a multiple rows of the same id in table2,(one (a) to many (b)  
relationship)

 I get all (multiple) rows - as you would expect - of course.

What I need however, is only one row returned per instance of a.id  
that is returned by the above query - the one with the greatest  
b.primaryKey would be ok.


I thought of using group by - but there are no calculated fields...  
and the real query contains 32 fields, which according to the errors I  
ran into while trying to get this working, would all need to be  
included in the group by clause.


So my understanding of group by is obviously a little dodgy - and  
obviously not quite what I was expecting.
I tried using a sub query and select distinct a.id. but that  
didn't exactly help either.


I have deliberately included in my example the fact there is an outer  
join too - I am not sue if that matters or not... but just in case  
thought it prudent to include it here.

On 29/06/2008, at 1:43 AM, Tom Lane wrote:


"Gavin 'Beau' Baumanis" <[EMAIL PROTECTED]> writes:

... If there a multiple rows of the same id in table1, I get all
(multiple) rows - as you would expect - of course.



What I need however, is only one row returned per instance a.id that
is returned by the above query.


You need GROUP BY a.id.


I thought of using group by - but there are no calculated fields...
and the real query contains 32 fields, which according to the  
errors I

ran into while trying to get this working, would all need to be
included in the group by clause.


No, you wouldn't want to do that, because then you'd be back to  
multiple

rows per a.id value.

The problem here is that for any one a.id value there could be  
multiple

values of the other variables (coming from different rows) and so the
query results are not well defined if you just add "GROUP BY a.id".
What you need to do is decide which of those values you want and use
an aggregate function to get it.  So your query might end up looking
like
select a.id, min(a.foo), avg(b.bar), ... from ... group by a.id;

regards, tom lane


Please contact me if you should have any questions.

Gavin 'Beau' Baumanis
Senior Application Developer
PalCare Pty. Ltd.

E: [EMAIL PROTECTED]
T: +61 -3 9381 4567
M: +61 -4 38 545 586
W: http://www.palcare.com.au

--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql