Tom, > In the end, the only reasonable way to handle this kind of thing is > to teach the query planner about it. Considering the small number > of cases that are usefully optimizable (basically only MIN and MAX > on a single table without any WHERE or GROUP clauses), and the ready > availability of a SQL-level workaround, it strikes me as a very > low-priority TODO item.
Low priority for you, Tom. For some of us, it's one of the three most high-priority "bugs" in PostgreSQL. I constantly try to sell my clients, and potential clients, on PostgreSQL. And the two things that trip me up the most frequently are lack of replication and our dog-slow aggregates. I can usually sell Postgres on our strong points, but the aggregate issue is *always* a problem. And the "slow aggregate" problem comes up about twice a week on Performance and three times a week on SQL. Regardless of the technical reason, among MSSQL, Oracle, MySQL and PostgreSQL, we have the slowest performing simple aggregates. It's very well to explain this is due to our system of extensible aggregates, but if a potential Postgres developer doesn't want to create custom aggregates, but does want to use MIN() in a correlated subquery, then they will go to a different RDBMS. As I said before, I'm absolutely thrilled that you came up with a solution for COUNT(*) ... GROUP BY queries through Hash Aggregates. That's half the picture, now we need a way to speed up MIN() and MAX() for simple one-column expressions. While there is a "workaround" using ORDER BY & LIMIT, this doesn't work for correlated subqueries or if one wants to evaluate the result of MAX() in the query. For example, the following query is not possible to "workaround" in PostgreSQL: select teams_desc.team_id, team_name, team_code, notes, min(teams_tree.treeno) as lnode, max(teams_tree.treeno) as rnode, parent.team_id as parent_id, count(*)/2 as tlevel from teams_desc JOIN teams_tree USING (team_id) join teams_tree parent ON parent.treeno < teams_tree.treeno join teams_tree parents on parents.treeno < teams_tree.treeno WHERE parent.treeno = (SELECT max(p1.treeno) from teams_tree p1 where p1.treeno < teams_tree.treeno and exists (select treeno from teams_tree p2 where p2.treeno > teams_tree.treeno and p2.team_id = p1.team_id)) AND EXISTS (select parents2.team_id from teams_tree parents2 where parents2.treeno > teams_tree.treeno AND parents2.team_id = parents.team_id) group by teams_desc.team_id, team_name, team_code, notes, parent.team_id; While one would hardly expect the above query to be fast, it is dissapointing that it takes about 8-10 times as long to execute on PostgreSQL as on MSSQL, since MSSQL seems to be able to use indexes to evaluate all three MIN() and MAX() expressions. Further, assigning such a common query function to a Postgres-specific workaround hardly upholds our project's dedication to standards. The fact that we are telling new users to use non-SQL-compliant code to do a query type present in 90% of databases bothers me every single time I give a newbie that advice. It still seems to me that if a query's WHERE expression can be evaluated using an index, then any related MIN() or MAX() expression should be evaluable using an index. That is, if you are selecting: SELECT MAX(team_id) FROM teams WHERE team_id BETWEEN 100 and 200; ... with an index on team_id then this entire query should be able to return trough an index scan. We've discussed the particular planner problems this presents for PostgreSQL, but I still believe that these are solvable ... and moreover, that we *need* to solve them if we're going to be competitive with other SQL RDBMSes. I do realize that it's my job to find something to do about this issue since I'm the one so worked up about it. What I'm concerned about is the possibility of having any idea or fix I come up with dismissed out of hand because it's a "low-priority todo". Please add up the questions and complaints of the users on SQL, NOVICE, and PERFORMANCE ... I know you read them. Thanks for reading, Tom. -- Josh Berkus Aglio Database Solutions San Francisco ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster