Sheng Y. Cheng wrote: > The Session 4.2.7. Aggregate Expressions in 8.3 document at > http://www.postgresql.org/docs/8.3/static/sql-expressions.html states "The > last form invokes the aggregate once for each input row regardless of null > or non-null values." I am wondering if the result I saw from 8.4.0 is a bug > fix for 8.3.1?
Well, a COUNT(ts.*) is in fact not of the last form, but the first. "ts.*" is a whole-row reference to t2, like just "ts" would be (as in "COUNT(t2)"). But there indeed seems to be something wrong. The query can be reduced into: SELECT t1.f1, COUNT(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts ON t1.f1 = ts.f1 GROUP BY t1.f1; With this you can reproduce the discrepancy in CVS HEAD alone - the query produces a different result if you remove the "OFFSET 0": postgres=# SELECT t1.f1, COUNT(ts) FROM t1 postgres-# LEFT JOIN postgres-# (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts postgres-# ON t1.f1 = ts.f1 postgres-# GROUP BY t1.f1; f1 | count -----+------- aaa | 0 bbb | 1 ccc | 0 (3 rows) postgres=# SELECT t1.f1, COUNT(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts ON t1.f1 = ts.f1 GROUP BY t1.f1; f1 | count -----+------- aaa | 1 bbb | 1 ccc | 1 (3 rows) Without the OFFSET, the subquery is "pulled up" into the top query, and that optimization makes the difference. PostgreSQL 8.4 is more aggressive at that optimization, which is why you saw different results on 8.3 and 8.4. The "pullup" code transforms the "ts" reference into a ROW constructor: postgres=# explain verbose SELECT t1.f1, COUNT(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts ON t1.f1 = ts.f1 GROUP BY t1.f1; QUERY PLAN -------------------------------------------------------------------------------- GroupAggregate (cost=181.86..362.51 rows=200 width=64) Output: t1.f1, count(ROW(t2.f1)) ... That transformation isn't 100% accurate. A ROW expression with all NULL columns is not the same thing as a NULL whole-row expression when it comes to STRICT functions - a strict function is invoked with the former, but not the latter, even though both return true for an IS NULL test. That let's us write the test case as: CREATE FUNCTION stricttest (a anyelement) RETURNS boolean AS $$ SELECT true; $$ LANGUAGE SQL STRICT; postgres=# SELECT t1.f1, stricttest(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts ON t1.f1 = ts.f1; f1 | stricttest -----+------------ aaa | bbb | t ccc | (3 rows) postgres=# SELECT t1.f1, stricttest(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts ON t1.f1 = ts.f1; f1 | stricttest -----+------------ aaa | t bbb | t ccc | t (3 rows) I can see two possible interpretations for this: 1. The subquery pull-up code is broken, the transformation of a whole-row reference to ROW(...) is not valid. 2. The semantics of STRICT with row arguments is broken. It should be made consistent with IS NULL. Strict function should not be called if the argument is a row value with all NULL columns. I'm not sure which interpretation is correct. Thoughts? The SQL spec probably has something to say about this. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs