Thanks for the feedback, Tom. > Tom Lane <t...@sss.pgh.pa.us> writes: > [...] > As far as I can see, the spec flat-out forbids this. In SQL:2021, > it's discussed in 7.17 <query expression> syntax rule 3) j) ix), which > defines [linear recursion]
(Aside: We don't have a copy of the SQL:2021 specification here (all we've got here is the SQL:2016 variant). We've browsed the ISO site and didn't find find SQL:2021 there. Is that a generally available draft document?) > So the problem with extending the spec here is that someday they might > extend it with some other semantics, and then we're between a rock and > a hard place. There are two issues, say [LIN] and [NON-LIN], here. Re [LIN]: [LIN] PostgreSQL does not allow multiple references to the recursive common table, even if the recursion is LINEAR. Plenty of examples of such queries are found in the documentation of established RDBMSs, among these IBM Db2 and MS SQL Server: - https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_73/sqlp/rbafyrecursivequeries.htm#d60691e2455 (example "Two tables used for recursion using recursive common table expressions") - https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-ver15#h-using-multiple-anchor-and-recursive-members (example "Using multiple anchor and recursive members") Db2 and SQL Server process such linear queries with multiple recursive CTE references just fine. As does SQLite3. With the proposed patch, PostgreSQL would be able to process these, too (and deliver the same expected result). > If there were really compelling reasons why (a) we have to have this > feature and (b) there's only one reasonable way for it to act, hence > no likelihood that the SQL committee will choose something else, then > I'd be willing to think about getting out front of the committee. > But you haven't made either case. > [...] > Do they all act the same? Has anybody that sits on the SQL committee > done it? (If you could point to DB2, in particular, I'd have some > confidence that the committee might standardize on their approach.) We'd classify the ability of established RDBMSs to cope with the just mentioned class of queries (and PostgreSQL's inability to do the same) as one compelling reason. Also, the existing functionality in Db2 and SQL Server would be a yardstick for the expected behavior. > A totally independent question is whether you've even defined a > well-defined behavior. There's an interesting comment in the spec: > > > NOTE 310 — The restrictions insure that each WLEi, viewed as a > transformation of the query names of the stratum, is monotonically > increasing. According to Tarski’s fixed point theorem, this insures that > there is a fixed point. The General Rules use Kleene’s fixed point > theorem to define a sequence that converges to the minimal fixed > point. > > > I don't know any of the underlying math here, but if you've got a > join of multiple recursion references then the number of output > rows could certainly be nonlinear, which very possibly breaks the > argument that there's a well-defined interpretation. This brings us to [NON-LIN]: [NON-LIN] NOTE 310 refers to Tarski's fixed point theorem and the prerequisite that the recursive CTE defines a MONOTONIC query (this guarantees the existence of a least fixed point which defines the meaning of a recursive CTE in the first place). MONOTONICITY and NON-LINEAR recursion, however, are two separate/orthogonal issues. A linear recursive query can be monotonic or non-monotonic (and PostgreSQL currently has a system of safeguards that aim to identify the latter kind of problematic queries, ruling out the use of EXCEPT, aggregation, and so forth), just like a non-linear query can. A mononotic non-linear recursive query approaches a least fixed point which makes the behavior of a non-linear recursive CTE as well-defined as a linear CTE. In fact, the document that led to the inclusion of WITH RECURSIVE into the SQL standard (see reference [1] below) mentions that "Linear recursion is a special case of non-linear recursion" (Section 3.9) in the fixpoint-based semantics. (It appears that the authors aimed to introduce non-linear WITH RECURSIVE from the start but the existing RECURSIVE UNION proposal at the time was restricted to linear recursion; so they paddled back and suggested to include non-linear recursion in a future SQL standard update, coined SQL4 back then). We do agree, though, that the absence of non-linear recursion in the SQL standard has openened the field for varied interpretations of the semantics. MariaDB, for example, admits non-linear recursion once you set the configuration option standard_compliant_cte to 0. However, a recursive table reference then returns *all* rows collected so far (not just those rows produced by the last iteration). This implicit change of behavior makes sense for use cases of non-linear recursion, yet may come as a surprise for query authors. Since this implicit change could alternatively be made explicit (and thus, arguably, clearer) in terms of a UNION with the recursive table reference, we'd argue to keep the semantics of recursive table references as is. But before you know, you end up with diverging semantics for a single SQL construct, just as you said above. Given this, we would propose the patch as a means to allow multiple recursive references for linear queries (see [LIN] above). This allows PostgreSQL to catch up with Db2, SQL Server, or SQLite3. The semantics in the [LIN] case are clear. Best wishes, --Denis and Torsten [1] S.J. Finkelstein, N. Mattos, I.S. Mumick, H. Pirahesh, "Expressing Recursive Queries in SQL,"" ANSI Document X3H2-96-075r1, 1996, https://www.kirusa.com/mumick/pspapers/ansiRevisedRecProp96-075r1.ps.Z