I have reworked the case of BUG #17842 to include the data and the questions for further investigation.
qualstest_data contais the data export with --insert (to test it on other DB systems) qualstest_query contains the failing query and a short introduction to the data. The problem is NOT to correct the query to a working case, but to show a fundamental problem with qual pushdown. On pg16b1 (same on 15.3) the explain of the second query produces: qualstest=# qualstest=# qualstest=# explain -- select * from ( -- select count(*) from ( -- select length(sel) from ( qualstest-# select * from ( qualstest(# select qualstest(# onum qualstest(# ,vname qualstest(# ,vlen qualstest(# ,nlen qualstest(# ,olen qualstest(# ,NULLIF(vlen-olen,0) as delta_len qualstest(# from ( qualstest(# select * qualstest(# ,('0'||split_part(split_part(nline,'(',2),')',1))::smallint as nlen qualstest(# ,('0'||split_part(split_part(oline,'(',2),')',1))::smallint as olen qualstest(# from newcol qualstest(# join oldcol on onum=nnum qualstest(# join ( qualstest(# select qualstest(# vnum qualstest(# ,split_part(vline,' ',1) as vname qualstest(# ,('0'||split_part(split_part(vline,'(',2),')',1))::smallint as vlen qualstest(# from varcol qualstest(# ) qv on nline like '%'||vname||'%' qualstest(# where nline not like '%KEY%' qualstest(# ) qj qualstest(# --limit 30 qualstest(# where vlen!=olen qualstest(# ) qcomp qualstest-# where qualstest-# nlen > 0 qualstest-# ; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=90.37..10257.60 rows=2188 width=44) Hash Cond: (newcol.nnum = oldcol.onum) Join Filter: ((('0'::text || split_part(split_part(varcol.vline, '('::text, 2), ')'::text, 1)))::smallint <> (('0'::text || split_part(split_part(oldcol.oline, '('::text, 2), ')'::text, 1)))::smallint) -> Nested Loop (cost=0.00..10008.26 rows=2199 width=73) Join Filter: (newcol.nline ~~ (('%'::text || split_part(varcol.vline, ' '::text, 1)) || '%'::text)) -> Seq Scan on newcol (cost=0.00..98.23 rows=738 width=36) Filter: ((nline !~~ '%KEY%'::text) AND ((('0'::text || split_part(split_part(nline, '('::text, 2), ')'::text, 1)))::smallint > 0)) -> Materialize (cost=0.00..14.94 rows=596 width=37) -> Seq Scan on varcol (cost=0.00..11.96 rows=596 width=37) -> Hash (cost=60.72..60.72 rows=2372 width=44) -> Seq Scan on oldcol (cost=0.00..60.72 rows=2372 width=44) (11 Zeilen) qualstest=# qualstest=# select version (); version --------------------------------------------------------------- PostgreSQL 16beta1, compiled by Visual C++ build 1934, 64-bit (1 Zeile) on execution: FEHLER: ungültige Eingabesyntax für Typ smallint: »08,2« ANALYSIS: The join conditions matches all rows from oldcol and newcol, which then are filtered by inner join with only the varchar columns from varcol. Therefore the lines ,('0'||split_part(split_part(nline,'(',2),')',1))::smallint as nlen ,('0'||split_part(split_part(oline,'(',2),')',1))::smallint as olen should be applied only to filtered results known to have smallint values between the parentesis in varchar definitions. This is done correctly in the first (full) query without the final where clause. When the where nlen > 0 comes into play, the plan is changed and the filter qual is applied to all lines. There are other lines where the cast is not possible and the query fails with above error. The fundamental problem is that quals should not be pushed down for tuples not in the result set, when the operation classes of these quals could error out. Some operator classes have no runtime errors (like cast from smallint to int), but when such an error is possible, they should not be applied to tuples not part of the joined result set! I stumbled over the error by adding this harmless where clause (where nlen > 0) to a just working query and got the error. Other where-clauses (where nnum < 100) cause the same error. Operator classes which could error out should not be applied for filtering columns from relations, which are not the outermost relation in joins and could be eliminated by another join. These queries are syntactically and semantically correct but the postgre implementations causes them to error out. This is very surprising for the SQL User! The problem seems to exist also in certain backbranches. Hans Buschmann
<<attachment: qualstest.zip>>