[SQL] select within aggregate?
Hi!! Of course the following is possible in various ways but i wonder if there is an elegant and simple solution: A part from apache log: remote_hostvarchar(50) request_urivarchar(50) ts timestamp with time zone Now i want to see which host has done how many requests. This is not difficult: SELECT remote_host, count(*) FROM apache_log GROUP BY remote_host; But now i would like to add a column to the result which shows the most recent requested uri for each group. This sounds quite easy at first glance but i see no simple solution. I suppose with max(ts) i could acquire the time of the most recent request within the group, but how do i get the corresponding request_uri without doing subquerys or something like that? Thank you very much! Klaus ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] select within aggregate?
On Fri, 6 May 2005 13:34:26 +0530 "Ramakrishnan Muralidharan" <[EMAIL PROTECTED]> wrote: > SELECT abc.remote_host , c , abc.request_uri , a.t FROM abc , ( select > remote_host , count(*) as c , max( ts ) as t from abc group by remote_host ) > as a > where a.remote_host = abc.remote_host and abc.ts = a.t I thought about this too. But what happens if there are accidentally two sets with same timestamp in the table? I suppose the outer select would match both of them. But a DISTINCT should help to avoid this. Ok, if you think this is the propper way to meet the requirement i will do so. Thank you for your answer! Klaus ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] select within aggregate?
On Fri, 6 May 2005 07:49:23 -0500 Bruno Wolff III <[EMAIL PROTECTED]> wrote: > You could use max(ts) as long as you can have two requests occur at the > same time. This might be a reasonable assumption, but it might have been > better to include a sequence so that each record would be guaranteed to > have a unique key. Yes, of course. So far it was more a theoretical question. > Using DISTINCT ON to get the latest URIs is probably better. You can > join that query to the one doing the count. This will probably be more > efficient than executing a subquery for each remote host. The DISTINCT ON construction looks quite interesting. It seems to allow some sorting within an group (even althought it's no real aggregate) in general. I can't say if it's really more eficient than the solution suggested by R.Muralidharan. But i will do some tests! Thank you very much! Klaus ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
