I see at least three different techniques, each of which performs better for different datasets. First, let's formalize our environment:
create table theTable (family int, member int, score int); insert into theTable values (1,1,10), (1,2,15), (1,3,12), (1,4,17); insert into theTable values (2,1,5),(2,2,7),(2,3,9),(2,4,10); insert into theTable values (3,1,4),(3,2,8),(3,3,2); You'd like to effectively select values based on top "nth order statistic", or the nth biggest number, which is equivalent to MAX() for n==0 (or n==1, if you think one-based). Unfortunately, I don't believe MySQL has a general "nth order statistic" operation. What's more, you want to join the selected order statistic back to its original data row, which MySQL can't do natively even with MAX(). The first technique is the one that's been suggested: Run a simply query for each family and limit yourself to two results: select * from theTable where family=1 order by score desc limit 2; select * from theTable where family=2 order by score desc limit 2; select * from theTable where family=3 order by score desc limit 2; This is efficient only if you've got a very small number of families and are willing to code a loop into your interface language. Alternatively, you can compute your order statistic explicitly by joining the table against itself, and then filter based on that: select max(curEntry.family) as family,max(curEntry.member) as member,max(curEntry.score),count(*) as orderstatistic from theTable as curEntry, theTable as greaterEntries where curEntry.family = greaterEntries.family and curEntry.score <= greaterEntries.score group by curEntry.family, curEntry.member having orderstatistic < 3 order by family, orderstatistic; This manages to pack to whole logic into one query, and is very easy to change to compute the top three, top four, etc. with no additional code. Unfortunately, MySQL is unlikely to be able to optimize this to look at less than the entire table regardless of what indices you provide. (A 'having' clause usually implies that this is the case.) If you have a very large number of members per family, this is clearly inefficient since you should only really need to analyze O(number of families) rows in order to compute your result. Finally, you can use temporary tables to compute intermediary results, with each step making use of MAX() in the hope that MySQL is able to optimize MAX() by using an index. I'm not sure if MySQL does currently optimize this or not (please post the answer, if anyone knows), but it's definitely doable in theory if you've got the right indices lying around. The SQL here would be along the lines of: create temporary table topScores select max(family) as family, max(score) as score from theTable group by theTable.family; create temporary table topMembers select max(theTable.family) as family, max(theTable.member) as member, max(theTable.score) as score from theTable, topScores where theTable.family = topScores.family and theTable.score = topScores.score group by theTable.family; create temporary table sndScores select max(theTable.family) as family, max(theTable.score) as score from theTable,topMembers where theTable.family = topMembers.family and theTable.member != topMembers.member group by theTable.family; create temporary table sndMembers select max(theTable.family) as family, max(theTable.member) as member, max(theTable.score) as score from theTable,sndScores,topMembers where theTable.family = sndScores.family and theTable.score = sndScores.score and sndScores.family = topMembers.family and theTable.member != topMembers.member group by theTable.family; insert into topMembers select * from sndMembers; select * from topMembers order by family,score DESC; drop table sndMembers; drop table sndScores; drop table topMembers; drop table topScores; The last temporary table, at the least, can be turned into a simple insert in our limited example, but I've provided it as a distinct table for clarity. This technique is certainly much more complex, but for very large numbers of members per family it might be more efficient, and it still requires no logic in the interface language: there is only one command which returns a result set for processing, and that is the final result. This approach does require additional logic to extend to more than the top two scores, which requires another pair of SQL statements per result. If you are looking for order statistics on the order of the average number of members per family, then the single-statement approach will almost certainly be a better choice. -Rob (because I'm bored at work, that's why...) On 22/5/02 at 10:36 am, R.Dobson <[EMAIL PROTECTED]> wrote: > Hi, > thanks for all of the replies to my query. > I'm not sure that I explained my problem very well as the solutions > received are solutions to the problem I described, but not the one I > meant :-) (I don't think anyway) > > > I have a table in the format: > > family | member | score > -------------------------------- > 1 | 1 | 10 > 1 | 2 | 15 > 1 | 3 | 12 > 1 | 4 | 17 > 2 | 1 | 5 > 2 | 2 | 7 > 2 | 3 | 9 > 2 | 4 | 10 > 3 | 1 | 4 > 3 | 2 | 8 > 3 | 3 | 2 > > I want the top 2 highest scorers for each family as in: > > family | member | score > ------------------------------- > 1 | 4 | 17 > 1 | 2 | 15 > 2 | 4 | 10 > 2 | 3 | 9 > 3 | 2 | 8 > 3 | 1 | 4 > > > Thanks again, > Rich > > mysql --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php