Andrew McMillan wrote: > > mlw wrote: > > > > Take these queries: > > > > select * from foo as F, (select * from bar where name = 'bla') as B where > > F.name = B.name > > union all > > select * from foo as F, (select * from bar where name = 'bla') as B where > > F.type = B.type > > > > OR > > > > create temp table B as select * from bar where name = 'bla'; > > select * from foo as F, B where F.name = B.name > > union all > > select * from foo as F, B where F.type = B.type; > > drop table B; > > > > My question is, which would be more efficient, or is it a wash? > > (A note, I will be calling this from an external programming laguage, PHP, so > > the first query would be one Postgres pq_exec call, while the second query > > would be three separate calls.) > > The second could also be done as a single PHP call, given that you should be able to > "create temp table ...; select ..." in a single pg_Exec call. > > You don't need a 'DROP TABLE B' because it's a temp table and will be dropped > anyway, won't it, unless you're using pg_pconnect. For a high volume website, where processing is done and latency are important considerations. Suppose, you have a few apache/php systems load balanced on top of a single database system. (This is a very standard configuration.) The apache/php machine cycles are cheaper than the database machine cycles because they can usually be scaled easily. A database system is very difficult to scale. While adding an apache/php box to this configuration is usually trivial, setting up a database system across two or more machines is a hugely more complex problem. Then there is latency, the longer a web pages takes to process, it holds its resources longer, this means you will probably have more web server processes, and if each process holds a database connection, you will probably have more open database connections. So, latency costs you the ram on the local web server and the resources on the back-end application service machines. So the real trick to getting good salability is to reduce latency AND move as much processing to the boxes which can be scaled. Persistent connections to a database are vital in this scenario. Creation of a new connection to a database impacts backend processing time and page latency. So one has to drop the temporary table. So, which is more expensive? Issuing the subquery multiple times within the larger query, or creating a temporary table, performing the simpler query, and then dropping the temp table. My guess would be that creating the temp table and dropping it again do use backend processing cycles. I wonder if PostgreSQL would be smart enough to perform that query only once? -- 5-4-3-2-1 Thunderbirds are GO! ------------------------ http://www.mohawksoft.com ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster