> What this patch does is to add two new clauses for FROM-list items, > NORMALIZE and ALIGN, which reshuffle a set of ranges into a new list > that can then be aggregated more easily. From the original message: > > > For NORMALIZE the tuples' ranges need to be split into all sub-ranges > > according to all matching ranges of the second relation. For this we > > create a subquery that first joins one relation with the range > > boundaries of the other and then sorts the result. The executor > > function splits the ranges in a sweep-line based manner. > > > > For ALIGN the tuples' ranges must be split into all intersections and > > differences with the other relation according to the join condition. > > For this we create a subquery that first joins the two relations and > > then sorts the result. The executor function splits the ranges > > accordingly in a sweep-line based manner. > > So there isn't really temporal query processing as such here, only some > helpers that can make it easier.
The goal of temporal aligners and normalizers is to split ranges to allow a reduction from temporal queries to their non-temporal counterparts. Splitting ranges is necessary for temporal query processing. Temporal aligners and normalizer may then be used as building-blocks for any temporal query construct. > I can see how those operations can be useful, but it would help if there > were a more formal definition to be able to check that further. We have published two papers, that contain formal definitions and related work for the temporal aligner and normalizer. Please see [1] and [2]. > What I'm missing here is some references: existing implementations, > standards, documentation, research papers, alternative ideas, rejected > alternatives, etc. A good overview of existing implementations in DBMSs, SQL standard, and history is given in [3]. > Also, the submission is missing documentation and test cases. There are > technical terms used in the code that I don't understand. We added a second patch with test cases and expected results. We are now writing the documentation in sgml-format. > I think there are probably many interesting applications for normalizing > or otherwise adjusting ranges. I'd like to see an overview and > consideration of other applications. Please see the attached file adjustment.sql for some interesting applications. > Ideally, I'd like to see these things implemented as some kind of > user-space construct, like an operator or function. I think we'd need a > clearer definition of what it is they do before we can evaluate that. Can you please explain what you mean by "user-space construct" in this case. Best regards, Anton, Johann, Michael, Peter ---- [1] Anton Dignös, Michael H. Böhlen, Johann Gamper: Temporal alignment. SIGMOD Conference 2012: 433-444 http://doi.acm.org/10.1145/2213836.2213886 [2] Anton Dignös, Michael H. Böhlen, Johann Gamper, Christian S. Jensen: Extending the Kernel of a Relational DBMS with Comprehensive Support for Sequenced Temporal Queries. ACM Trans. Database Syst. 41(4): 26:1-26:46 (2016) http://doi.acm.org/10.1145/2967608 [3] https://www2.cs.arizona.edu/people/rts/sql3.html and https://www2.cs.arizona.edu/people/rts/tsql2.html
adjustment.sql
Description: application/sql
diff --git src/test/regress/expected/temporal_primitives.out src/test/regress/expected/temporal_primitives.out new file mode 100644 index 0000000..6e4cc0d --- /dev/null +++ src/test/regress/expected/temporal_primitives.out @@ -0,0 +1,841 @@ +-- +-- TEMPORAL PRIMITIVES +-- +SET datestyle TO ymd; +CREATE COLLATION "de_DE.utf8" (LC_COLLATE = "de_DE.utf8", + LC_CTYPE = "de_DE.utf8" ); +CREATE TEMP TABLE tpg_table1 (a char, b char, ts int, te int); +CREATE TEMP TABLE tpg_table2 (c int, d char, ts int, te int); +INSERT INTO tpg_table1 VALUES +('a','B',1,7), +('b','B',3,9), +('c','G',8,10); +INSERT INTO tpg_table2 VALUES +(1,'B',2,5), +(2,'B',3,4), +(3,'B',7,9); +-- VALID TIME columns (i.e., ts and te) are no longer at the end of the +-- targetlist. +CREATE TEMP TABLE tpg_table3 AS + SELECT a, ts, te, b FROM tpg_table1; +CREATE TEMP TABLE tpg_table4 AS + SELECT c, ts, d, te FROM tpg_table2; +-- VALID TIME columns represented as range type +CREATE TEMP TABLE tpg_table5 AS + SELECT int4range(ts, te) t, a, b FROM tpg_table1; +CREATE TEMP TABLE tpg_table6 AS + SELECT int4range(ts, te) t, c a, d b FROM tpg_table2; +-- VALID TIME columns as VARCHARs +CREATE TEMP TABLE tpg_table7 (a int, ts varchar, te varchar); +CREATE TEMP TABLE tpg_table8 (a int, + ts varchar COLLATE "de_DE.utf8", + te varchar COLLATE "POSIX"); +INSERT INTO tpg_table7 VALUES +(0, 'A', 'D'), +(1, 'C', 'X'), +(0, 'ABC', 'BCD'), +(0, 'xABC', 'xBCD'), +(0, 'BAA', 'BBB'); +INSERT INTO tpg_table8 VALUES +(0, 'A', 'D'), +(1, 'C', 'X'); +-- Tables to check different data types, and corner cases +CREATE TEMP TABLE tpg_table9 (a int, ts timestamp, te timestamp); +CREATE TEMP TABLE tpg_table10 (a int, ts double precision, te double precision); +CREATE TEMP TABLE tpg_table11 AS TABLE tpg_table10; +INSERT INTO tpg_table9 VALUES +(0, '2000-01-01', '2000-01-10'), +(1, '2000-01-05', '2000-01-20'); +INSERT INTO tpg_table10 VALUES +(0, 1.0, 1.1111), +(1, 1.11109999, 2.0); +INSERT INTO tpg_table11 VALUES +(0, 1.0, 'Infinity'), +(1, '-Infinity', 2.0); +-- +-- TEMPORAL ALIGNER: BASICS +-- +-- Equality qualifiers +SELECT * FROM ( + tpg_table1 ALIGN tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x; + a | b | ts | te +---+---+----+---- + a | B | 1 | 2 + a | B | 2 | 5 + a | B | 3 | 4 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 3 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(9 rows) + +-- Equality qualifiers with FQN inside ON- and WITH-clause +SELECT * FROM ( + tpg_table1 ALIGN tpg_table2 + ON tpg_table1.b = tpg_table2.d + WITH (tpg_table1.ts, tpg_table1.te, tpg_table2.ts, tpg_table2.te) + ) x; + a | b | ts | te +---+---+----+---- + a | B | 1 | 2 + a | B | 2 | 5 + a | B | 3 | 4 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 3 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(9 rows) + +-- Alignment with aggregation +-- NB: Targetlist of outer query is *not* A_STAR... +SELECT a, COUNT(a) FROM ( + tpg_table1 ALIGN tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x + GROUP BY a ORDER BY a; + a | count +---+------- + a | 4 + b | 4 + c | 1 +(3 rows) + +-- Equality qualifiers +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT * FROM ( + tpg_table3 ALIGN tpg_table4 + ON b = d + WITH (ts, te, ts, te) + ) x; + a | ts | te | b +---+----+----+--- + a | 1 | 2 | B + a | 2 | 5 | B + a | 3 | 4 | B + a | 5 | 7 | B + b | 3 | 4 | B + b | 3 | 5 | B + b | 5 | 7 | B + b | 7 | 9 | B + c | 8 | 10 | G +(9 rows) + +-- Equality qualifiers with FQN inside ON- and WITH-clause +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT * FROM ( + tpg_table3 ALIGN tpg_table4 + ON tpg_table3.b = tpg_table4.d + WITH (tpg_table3.ts, tpg_table3.te, tpg_table4.ts, tpg_table4.te) + ) x; + a | ts | te | b +---+----+----+--- + a | 1 | 2 | B + a | 2 | 5 | B + a | 3 | 4 | B + a | 5 | 7 | B + b | 3 | 4 | B + b | 3 | 5 | B + b | 5 | 7 | B + b | 7 | 9 | B + c | 8 | 10 | G +(9 rows) + +-- Alignment with aggregation where targetlist of outer query is *not* A_STAR... +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT a, COUNT(a) FROM ( + tpg_table3 ALIGN tpg_table4 + ON b = d + WITH (ts, te, ts, te) + ) x + GROUP BY a ORDER BY a; + a | count +---+------- + a | 4 + b | 4 + c | 1 +(3 rows) + +-- Test relations with differently named temporal bound attributes and relation +-- and column aliases. +SELECT * FROM ( + tpg_table1 ALIGN tpg_table2 x(c,d,s,e) + ON b = d + WITH (ts, te, s, e) + ) x; + a | b | ts | te +---+---+----+---- + a | B | 1 | 2 + a | B | 2 | 5 + a | B | 3 | 4 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 3 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(9 rows) + +-- Range types for temporal boundaries, i.e., valid time columns +SELECT * FROM ( + tpg_table5 ALIGN tpg_table6 + ON TRUE + WITH (t, t) + ) x; + t | a | b +--------+---+--- + [1,2) | a | B + [2,5) | a | B + [3,4) | a | B + [5,7) | a | B + [3,4) | b | B + [3,5) | b | B + [5,7) | b | B + [7,9) | b | B + [8,9) | c | G + [9,10) | c | G +(10 rows) + +-- +-- TEMPORAL ALIGNER: TEMPORAL JOIN EXAMPLE +-- +-- Full temporal join example with absorbing where clause, timestamp +-- propagation (see CTEs targetlists with V and U) and range types +WITH t1 AS (SELECT *, t u FROM tpg_table5), + t2 AS (SELECT *, t v FROM tpg_table6) +SELECT t, b, x.a, y.a FROM ( + t1 ALIGN t2 + ON t1.b = t2.b + WITH (t, t) + ) x + LEFT OUTER JOIN ( + SELECT * FROM ( + t2 ALIGN t1 + ON t1.b = t2.b + WITH (t, t) + ) y + ) y + USING (b, t) + WHERE ( + (lower(t) = lower(u) OR lower(t) = lower(v)) + AND + (upper(t) = upper(u) OR upper(t) = upper(v)) + ) + OR u IS NULL + OR v IS NULL + ORDER BY 1,2,3,4; + t | b | a | a +--------+---+---+--- + [1,2) | B | a | + [2,5) | B | a | 1 + [3,4) | B | a | 2 + [3,4) | B | b | 2 + [3,5) | B | b | 1 + [5,7) | B | a | + [5,7) | B | b | + [7,9) | B | b | 3 + [8,10) | G | c | +(9 rows) + +-- Full temporal join example with absorbing where clause, timestamp +-- propagation (see CTEs targetlists with V and U) and scalar VALID TIME columns +WITH t1 AS (SELECT *, ts us, te ue FROM tpg_table1), + t2 AS (SELECT *, ts vs, te ve FROM tpg_table2) +SELECT x.ts, x.te, b, a, c FROM ( + t1 ALIGN t2 + ON b = d + WITH (ts, te, ts, te) + ) x + LEFT OUTER JOIN ( + SELECT * FROM ( + t2 ALIGN t1 + ON b = d + WITH (ts, te, ts, te) + ) y + ) y + ON b = d AND x.ts = y.ts AND x.te = y.te + WHERE ( + (x.ts = us OR x.ts = vs) + AND + (x.te = ue OR x.te = ve) + ) + OR us IS NULL + OR vs IS NULL + ORDER BY 1,2,3,4; + ts | te | b | a | c +----+----+---+---+--- + 1 | 2 | B | a | + 2 | 5 | B | a | 1 + 3 | 4 | B | a | 2 + 3 | 4 | B | b | 2 + 3 | 5 | B | b | 1 + 5 | 7 | B | a | + 5 | 7 | B | b | + 7 | 9 | B | b | 3 + 8 | 10 | G | c | +(9 rows) + +-- Collation and varchar boundaries +SELECT * FROM ( + tpg_table7 x ALIGN tpg_table7 y + ON TRUE + WITH (ts, te, ts, te) + ) x; + a | ts | te +---+------+------ + 0 | A | D + 0 | ABC | BCD + 0 | BAA | BBB + 0 | C | D + 1 | C | D + 1 | C | X + 0 | ABC | BCD + 0 | BAA | BBB + 0 | xABC | xBCD + 0 | BAA | BBB +(10 rows) + +-- Collation and varchar boundaries with incompatible collations (ERROR expected) +SELECT * FROM ( + tpg_table8 x ALIGN tpg_table8 y + ON TRUE + WITH (ts, te, ts, te) + ) x; +ERROR: could not determine which collation to use for string comparison +HINT: Use the COLLATE clause to set the collation explicitly. +-- +-- TEMPORAL ALIGNER: SELECTION PUSH-DOWN +-- +-- VALID TIME columns are not safe to be pushed down, for the rest everything +-- should work as usual. +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 ALIGN tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3; + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------ + Subquery Scan on x + -> Adjustment(for ALIGN) + -> Sort + Sort Key: (row_id() OVER (?)), (GREATEST(tpg_table2.ts, tpg_table1.ts)), (LEAST(tpg_table2.te, tpg_table1.te)) + -> Nested Loop Left Join + Join Filter: ((tpg_table2.ts < tpg_table1.te) AND (tpg_table2.te > tpg_table1.ts)) + -> WindowAgg + -> Seq Scan on tpg_table2 + Filter: (c < 3) + -> Materialize + -> Seq Scan on tpg_table1 +(11 rows) + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 ALIGN tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3 AND ts > 3; + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------ + Subquery Scan on x + Filter: (x.ts > 3) + -> Adjustment(for ALIGN) + -> Sort + Sort Key: (row_id() OVER (?)), (GREATEST(tpg_table2.ts, tpg_table1.ts)), (LEAST(tpg_table2.te, tpg_table1.te)) + -> Nested Loop Left Join + Join Filter: ((tpg_table2.ts < tpg_table1.te) AND (tpg_table2.te > tpg_table1.ts)) + -> WindowAgg + -> Seq Scan on tpg_table2 + Filter: (c < 3) + -> Materialize + -> Seq Scan on tpg_table1 +(12 rows) + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 ALIGN tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3 OR ts > 3; + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------ + Subquery Scan on x + Filter: ((x.c < 3) OR (x.ts > 3)) + -> Adjustment(for ALIGN) + -> Sort + Sort Key: (row_id() OVER (?)), (GREATEST(tpg_table2.ts, tpg_table1.ts)), (LEAST(tpg_table2.te, tpg_table1.te)) + -> Nested Loop Left Join + Join Filter: ((tpg_table2.ts < tpg_table1.te) AND (tpg_table2.te > tpg_table1.ts)) + -> WindowAgg + -> Seq Scan on tpg_table2 + -> Materialize + -> Seq Scan on tpg_table1 +(11 rows) + +-- +-- TEMPORAL ALIGNER: DATA TYPES +-- +-- Data types: Timestamps +-- We use to_char here to be sure that we have the same output format on all +-- platforms and locale configuration +SELECT a, to_char(ts, 'YYYY-MM-DD') ts, to_char(te, 'YYYY-MM-DD') te FROM ( + tpg_table9 t1 ALIGN tpg_table9 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + a | ts | te +---+------------+------------ + 0 | 2000-01-01 | 2000-01-10 + 0 | 2000-01-05 | 2000-01-10 + 1 | 2000-01-05 | 2000-01-20 +(3 rows) + +-- Data types: Double precision +SELECT a, ts, te FROM ( + tpg_table10 t1 ALIGN tpg_table10 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + a | ts | te +---+------------+-------- + 0 | 1 | 1.1111 + 0 | 1.11109999 | 1.1111 + 1 | 1.11109999 | 2 +(3 rows) + +-- Data types: Double precision with +/- infinity +SELECT a, ts, te FROM ( + tpg_table11 t1 ALIGN tpg_table11 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + a | ts | te +---+-----------+---------- + 0 | 1 | 2 + 0 | 1 | Infinity + 1 | -Infinity | 2 +(3 rows) + +-- +-- TEMPORAL NORMALIZER: BASICS +-- +-- Equality qualifiers +SELECT * FROM ( + tpg_table1 NORMALIZE tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x; + a | b | ts | te +---+---+----+---- + a | B | 1 | 2 + a | B | 2 | 3 + a | B | 3 | 4 + a | B | 4 | 5 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 4 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(10 rows) + +-- Equality qualifiers with FQN inside ON- and WITH-clause +SELECT * FROM ( + tpg_table1 NORMALIZE tpg_table2 + ON tpg_table1.b = tpg_table2.d + WITH (tpg_table1.ts, tpg_table1.te, tpg_table2.ts, tpg_table2.te) + ) x; + a | b | ts | te +---+---+----+---- + a | B | 1 | 2 + a | B | 2 | 3 + a | B | 3 | 4 + a | B | 4 | 5 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 4 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(10 rows) + +-- Alignment with aggregation +-- NB: Targetlist of outer query is *not* A_STAR... +SELECT a, COUNT(a) FROM ( + tpg_table1 NORMALIZE tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x + GROUP BY a ORDER BY a; + a | count +---+------- + a | 5 + b | 4 + c | 1 +(3 rows) + +-- Equality qualifiers +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT * FROM ( + tpg_table3 NORMALIZE tpg_table4 + ON b = d + WITH (ts, te, ts, te) + ) x; + a | ts | te | b +---+----+----+--- + a | 1 | 2 | B + a | 2 | 3 | B + a | 3 | 4 | B + a | 4 | 5 | B + a | 5 | 7 | B + b | 3 | 4 | B + b | 4 | 5 | B + b | 5 | 7 | B + b | 7 | 9 | B + c | 8 | 10 | G +(10 rows) + +-- Equality qualifiers with FQN inside ON- and WITH-clause +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT * FROM ( + tpg_table3 NORMALIZE tpg_table4 + ON tpg_table3.b = tpg_table4.d + WITH (tpg_table3.ts, tpg_table3.te, tpg_table4.ts, tpg_table4.te) + ) x; + a | ts | te | b +---+----+----+--- + a | 1 | 2 | B + a | 2 | 3 | B + a | 3 | 4 | B + a | 4 | 5 | B + a | 5 | 7 | B + b | 3 | 4 | B + b | 4 | 5 | B + b | 5 | 7 | B + b | 7 | 9 | B + c | 8 | 10 | G +(10 rows) + +-- Alignment with aggregation where targetlist of outer query is *not* A_STAR... +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT a, COUNT(a) FROM ( + tpg_table3 NORMALIZE tpg_table4 + ON b = d + WITH (ts, te, ts, te) + ) x + GROUP BY a ORDER BY a; + a | count +---+------- + a | 5 + b | 4 + c | 1 +(3 rows) + +-- Test relations with differently named temporal bound attributes and relation +-- and column aliases. +SELECT * FROM ( + tpg_table1 NORMALIZE tpg_table2 x(c,d,s,e) + ON b = d + WITH (ts, te, s, e) + ) x; + a | b | ts | te +---+---+----+---- + a | B | 1 | 2 + a | B | 2 | 3 + a | B | 3 | 4 + a | B | 4 | 5 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 4 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(10 rows) + +-- Normalizer's USING clause (self-normalization) +SELECT * FROM ( + tpg_table1 t1 NORMALIZE tpg_table1 t2 + USING (a) + WITH (ts, te, ts, te) + ) x; + a | b | ts | te +---+---+----+---- + a | B | 1 | 7 + b | B | 3 | 9 + c | G | 8 | 10 +(3 rows) + +-- Range types for temporal boundaries, i.e., valid time columns +SELECT * FROM ( + tpg_table5 NORMALIZE tpg_table6 + USING (b) + WITH (t, t) + ) x; + t | a | b +--------+---+--- + [1,2) | a | B + [2,3) | a | B + [3,4) | a | B + [4,5) | a | B + [5,7) | a | B + [3,4) | b | B + [4,5) | b | B + [5,7) | b | B + [7,9) | b | B + [8,10) | c | G +(10 rows) + +-- Collation and varchar boundaries +SELECT * FROM ( + tpg_table7 x NORMALIZE tpg_table7 y + ON TRUE + WITH (ts, te, ts, te) + ) x; + a | ts | te +---+------+------ + 0 | A | ABC + 0 | ABC | BAA + 0 | BAA | BBB + 0 | BBB | BCD + 0 | BCD | C + 0 | C | D + 1 | C | D + 1 | D | X + 0 | ABC | BAA + 0 | BAA | BBB + 0 | BBB | BCD + 0 | xABC | xBCD + 0 | BAA | BBB +(13 rows) + +-- Collation and varchar boundaries with incompatible collations (ERROR expected) +SELECT * FROM ( + tpg_table8 x NORMALIZE tpg_table8 y + ON TRUE + WITH (ts, te, ts, te) + ) x; +ERROR: could not determine which collation to use for string comparison +HINT: Use the COLLATE clause to set the collation explicitly. +-- +-- TEMPORAL NORMALIZER: SELECTION PUSH-DOWN +-- +-- VALID TIME columns are not safe to be pushed down, for the rest everything +-- should work as usual. +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 NORMALIZE tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3; + QUERY PLAN +--------------------------------------------------------------------------------------------------------- + Subquery Scan on x + -> Adjustment(for NORMALIZE) + -> Sort + Sort Key: (row_id() OVER (?)), tpg_table1.ts + -> Nested Loop Left Join + Join Filter: ((tpg_table1.ts >= tpg_table2.ts) AND (tpg_table1.ts < tpg_table2.te)) + -> WindowAgg + -> Seq Scan on tpg_table2 + Filter: (c < 3) + -> Materialize + -> Append + -> Seq Scan on tpg_table1 + -> Seq Scan on tpg_table1 tpg_table1_1 +(13 rows) + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 NORMALIZE tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3 AND ts > 3; + QUERY PLAN +--------------------------------------------------------------------------------------------------------- + Subquery Scan on x + Filter: (x.ts > 3) + -> Adjustment(for NORMALIZE) + -> Sort + Sort Key: (row_id() OVER (?)), tpg_table1.ts + -> Nested Loop Left Join + Join Filter: ((tpg_table1.ts >= tpg_table2.ts) AND (tpg_table1.ts < tpg_table2.te)) + -> WindowAgg + -> Seq Scan on tpg_table2 + Filter: (c < 3) + -> Materialize + -> Append + -> Seq Scan on tpg_table1 + -> Seq Scan on tpg_table1 tpg_table1_1 +(14 rows) + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 NORMALIZE tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3 OR ts > 3; + QUERY PLAN +--------------------------------------------------------------------------------------------------------- + Subquery Scan on x + Filter: ((x.c < 3) OR (x.ts > 3)) + -> Adjustment(for NORMALIZE) + -> Sort + Sort Key: (row_id() OVER (?)), tpg_table1.ts + -> Nested Loop Left Join + Join Filter: ((tpg_table1.ts >= tpg_table2.ts) AND (tpg_table1.ts < tpg_table2.te)) + -> WindowAgg + -> Seq Scan on tpg_table2 + -> Materialize + -> Append + -> Seq Scan on tpg_table1 + -> Seq Scan on tpg_table1 tpg_table1_1 +(13 rows) + +-- +-- TEMPORAL NORMALIZER: DATA TYPES +-- +-- Data types: Timestamps +-- We use to_char here to be sure that we have the same output format on all +-- platforms and locale configuration +SELECT a, to_char(ts, 'YYYY-MM-DD') ts, to_char(te, 'YYYY-MM-DD') te FROM ( + tpg_table9 t1 NORMALIZE tpg_table9 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + a | ts | te +---+------------+------------ + 0 | 2000-01-01 | 2000-01-05 + 0 | 2000-01-05 | 2000-01-10 + 1 | 2000-01-05 | 2000-01-20 +(3 rows) + +-- Data types: Double precision +SELECT a, ts, te FROM ( + tpg_table10 t1 NORMALIZE tpg_table10 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + a | ts | te +---+------------+------------ + 0 | 1 | 1.11109999 + 0 | 1.11109999 | 1.1111 + 1 | 1.11109999 | 2 +(3 rows) + +-- Data types: Double precision with +/- infinity +SELECT a, ts, te FROM ( + tpg_table11 t1 NORMALIZE tpg_table11 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + a | ts | te +---+-----------+---------- + 0 | 1 | 2 + 0 | 2 | Infinity + 1 | -Infinity | 2 +(3 rows) + +-- +-- TEMPORAL ALIGNER AND NORMALIZER: VIEWS +-- +-- Views with temporal normalization +CREATE TEMP VIEW v AS SELECT * FROM ( + tpg_table1 NORMALIZE tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x; +TABLE v; + a | b | ts | te +---+---+----+---- + a | B | 1 | 2 + a | B | 2 | 3 + a | B | 3 | 4 + a | B | 4 | 5 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 4 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(10 rows) + +DROP VIEW v; +-- Views with temporal alignment +CREATE TEMP VIEW v AS SELECT * FROM ( + tpg_table1 ALIGN tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x; +TABLE v; + a | b | ts | te +---+---+----+---- + a | B | 1 | 2 + a | B | 2 | 5 + a | B | 3 | 4 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 3 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(9 rows) + +DROP VIEW v; +-- Testing temporal normalization with ambiguous columns, i.e. columns that +-- are used internally... +CREATE TEMP VIEW v AS SELECT * FROM ( + tpg_table1 AS r(p1, p1_0, "p1_-1", p1_1) NORMALIZE tpg_table2 s + ON r.p1_0 = s.d + WITH ("p1_-1", p1_1, ts, te) + ) x; +TABLE v; + p1 | p1_0 | p1_-1 | p1_1 +----+------+-------+------ + a | B | 1 | 2 + a | B | 2 | 3 + a | B | 3 | 4 + a | B | 4 | 5 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 4 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(10 rows) + +DROP VIEW v; +-- Testing temporal alignment with ambiguous columns, i.e. columns that +-- are used internally... +CREATE TEMP VIEW v AS SELECT * FROM ( + tpg_table1 AS r(p1, p1_0, "p1_-1", p1_1) ALIGN tpg_table2 s + ON r.p1_0 = s.d + WITH ("p1_-1",p1_1,ts,te) + ) x; +TABLE v; + p1 | p1_0 | p1_-1 | p1_1 +----+------+-------+------ + a | B | 1 | 2 + a | B | 2 | 5 + a | B | 3 | 4 + a | B | 5 | 7 + b | B | 3 | 4 + b | B | 3 | 5 + b | B | 5 | 7 + b | B | 7 | 9 + c | G | 8 | 10 +(9 rows) + +DROP VIEW v; diff --git src/test/regress/parallel_schedule src/test/regress/parallel_schedule index 8641769..61813ef 100644 --- src/test/regress/parallel_schedule +++ src/test/regress/parallel_schedule @@ -89,7 +89,7 @@ test: brin gin gist spgist privileges init_privs security_label collate matview # ---------- # Another group of parallel tests # ---------- -test: alter_generic alter_operator misc psql async dbsize misc_functions tsrf +test: alter_generic alter_operator misc psql async dbsize misc_functions tsrf temporal_primitives # rules cannot run concurrently with any test that creates a view test: rules psql_crosstab amutils diff --git src/test/regress/sql/temporal_primitives.sql src/test/regress/sql/temporal_primitives.sql new file mode 100644 index 0000000..9b475f7 --- /dev/null +++ src/test/regress/sql/temporal_primitives.sql @@ -0,0 +1,456 @@ +-- +-- TEMPORAL PRIMITIVES +-- +SET datestyle TO ymd; + +CREATE COLLATION "de_DE.utf8" (LC_COLLATE = "de_DE.utf8", + LC_CTYPE = "de_DE.utf8" ); + +CREATE TEMP TABLE tpg_table1 (a char, b char, ts int, te int); +CREATE TEMP TABLE tpg_table2 (c int, d char, ts int, te int); + +INSERT INTO tpg_table1 VALUES +('a','B',1,7), +('b','B',3,9), +('c','G',8,10); +INSERT INTO tpg_table2 VALUES +(1,'B',2,5), +(2,'B',3,4), +(3,'B',7,9); + +-- VALID TIME columns (i.e., ts and te) are no longer at the end of the +-- targetlist. +CREATE TEMP TABLE tpg_table3 AS + SELECT a, ts, te, b FROM tpg_table1; +CREATE TEMP TABLE tpg_table4 AS + SELECT c, ts, d, te FROM tpg_table2; + +-- VALID TIME columns represented as range type +CREATE TEMP TABLE tpg_table5 AS + SELECT int4range(ts, te) t, a, b FROM tpg_table1; +CREATE TEMP TABLE tpg_table6 AS + SELECT int4range(ts, te) t, c a, d b FROM tpg_table2; + +-- VALID TIME columns as VARCHARs +CREATE TEMP TABLE tpg_table7 (a int, ts varchar, te varchar); +CREATE TEMP TABLE tpg_table8 (a int, + ts varchar COLLATE "de_DE.utf8", + te varchar COLLATE "POSIX"); + +INSERT INTO tpg_table7 VALUES +(0, 'A', 'D'), +(1, 'C', 'X'), +(0, 'ABC', 'BCD'), +(0, 'xABC', 'xBCD'), +(0, 'BAA', 'BBB'); + +INSERT INTO tpg_table8 VALUES +(0, 'A', 'D'), +(1, 'C', 'X'); + +-- Tables to check different data types, and corner cases +CREATE TEMP TABLE tpg_table9 (a int, ts timestamp, te timestamp); +CREATE TEMP TABLE tpg_table10 (a int, ts double precision, te double precision); +CREATE TEMP TABLE tpg_table11 AS TABLE tpg_table10; + +INSERT INTO tpg_table9 VALUES +(0, '2000-01-01', '2000-01-10'), +(1, '2000-01-05', '2000-01-20'); + +INSERT INTO tpg_table10 VALUES +(0, 1.0, 1.1111), +(1, 1.11109999, 2.0); + +INSERT INTO tpg_table11 VALUES +(0, 1.0, 'Infinity'), +(1, '-Infinity', 2.0); + + +-- +-- TEMPORAL ALIGNER: BASICS +-- + +-- Equality qualifiers +SELECT * FROM ( + tpg_table1 ALIGN tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x; + +-- Equality qualifiers with FQN inside ON- and WITH-clause +SELECT * FROM ( + tpg_table1 ALIGN tpg_table2 + ON tpg_table1.b = tpg_table2.d + WITH (tpg_table1.ts, tpg_table1.te, tpg_table2.ts, tpg_table2.te) + ) x; + +-- Alignment with aggregation +-- NB: Targetlist of outer query is *not* A_STAR... +SELECT a, COUNT(a) FROM ( + tpg_table1 ALIGN tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x + GROUP BY a ORDER BY a; + +-- Equality qualifiers +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT * FROM ( + tpg_table3 ALIGN tpg_table4 + ON b = d + WITH (ts, te, ts, te) + ) x; + +-- Equality qualifiers with FQN inside ON- and WITH-clause +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT * FROM ( + tpg_table3 ALIGN tpg_table4 + ON tpg_table3.b = tpg_table4.d + WITH (tpg_table3.ts, tpg_table3.te, tpg_table4.ts, tpg_table4.te) + ) x; + +-- Alignment with aggregation where targetlist of outer query is *not* A_STAR... +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT a, COUNT(a) FROM ( + tpg_table3 ALIGN tpg_table4 + ON b = d + WITH (ts, te, ts, te) + ) x + GROUP BY a ORDER BY a; + +-- Test relations with differently named temporal bound attributes and relation +-- and column aliases. +SELECT * FROM ( + tpg_table1 ALIGN tpg_table2 x(c,d,s,e) + ON b = d + WITH (ts, te, s, e) + ) x; + +-- Range types for temporal boundaries, i.e., valid time columns +SELECT * FROM ( + tpg_table5 ALIGN tpg_table6 + ON TRUE + WITH (t, t) + ) x; + +-- +-- TEMPORAL ALIGNER: TEMPORAL JOIN EXAMPLE +-- + +-- Full temporal join example with absorbing where clause, timestamp +-- propagation (see CTEs targetlists with V and U) and range types +WITH t1 AS (SELECT *, t u FROM tpg_table5), + t2 AS (SELECT *, t v FROM tpg_table6) +SELECT t, b, x.a, y.a FROM ( + t1 ALIGN t2 + ON t1.b = t2.b + WITH (t, t) + ) x + LEFT OUTER JOIN ( + SELECT * FROM ( + t2 ALIGN t1 + ON t1.b = t2.b + WITH (t, t) + ) y + ) y + USING (b, t) + WHERE ( + (lower(t) = lower(u) OR lower(t) = lower(v)) + AND + (upper(t) = upper(u) OR upper(t) = upper(v)) + ) + OR u IS NULL + OR v IS NULL + ORDER BY 1,2,3,4; + +-- Full temporal join example with absorbing where clause, timestamp +-- propagation (see CTEs targetlists with V and U) and scalar VALID TIME columns +WITH t1 AS (SELECT *, ts us, te ue FROM tpg_table1), + t2 AS (SELECT *, ts vs, te ve FROM tpg_table2) +SELECT x.ts, x.te, b, a, c FROM ( + t1 ALIGN t2 + ON b = d + WITH (ts, te, ts, te) + ) x + LEFT OUTER JOIN ( + SELECT * FROM ( + t2 ALIGN t1 + ON b = d + WITH (ts, te, ts, te) + ) y + ) y + ON b = d AND x.ts = y.ts AND x.te = y.te + WHERE ( + (x.ts = us OR x.ts = vs) + AND + (x.te = ue OR x.te = ve) + ) + OR us IS NULL + OR vs IS NULL + ORDER BY 1,2,3,4; + +-- Collation and varchar boundaries +SELECT * FROM ( + tpg_table7 x ALIGN tpg_table7 y + ON TRUE + WITH (ts, te, ts, te) + ) x; + +-- Collation and varchar boundaries with incompatible collations (ERROR expected) +SELECT * FROM ( + tpg_table8 x ALIGN tpg_table8 y + ON TRUE + WITH (ts, te, ts, te) + ) x; + +-- +-- TEMPORAL ALIGNER: SELECTION PUSH-DOWN +-- + +-- VALID TIME columns are not safe to be pushed down, for the rest everything +-- should work as usual. +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 ALIGN tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3; + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 ALIGN tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3 AND ts > 3; + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 ALIGN tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3 OR ts > 3; + +-- +-- TEMPORAL ALIGNER: DATA TYPES +-- + +-- Data types: Timestamps +-- We use to_char here to be sure that we have the same output format on all +-- platforms and locale configuration +SELECT a, to_char(ts, 'YYYY-MM-DD') ts, to_char(te, 'YYYY-MM-DD') te FROM ( + tpg_table9 t1 ALIGN tpg_table9 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + +-- Data types: Double precision +SELECT a, ts, te FROM ( + tpg_table10 t1 ALIGN tpg_table10 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + +-- Data types: Double precision with +/- infinity +SELECT a, ts, te FROM ( + tpg_table11 t1 ALIGN tpg_table11 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + + +-- +-- TEMPORAL NORMALIZER: BASICS +-- + +-- Equality qualifiers +SELECT * FROM ( + tpg_table1 NORMALIZE tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x; + +-- Equality qualifiers with FQN inside ON- and WITH-clause +SELECT * FROM ( + tpg_table1 NORMALIZE tpg_table2 + ON tpg_table1.b = tpg_table2.d + WITH (tpg_table1.ts, tpg_table1.te, tpg_table2.ts, tpg_table2.te) + ) x; + +-- Alignment with aggregation +-- NB: Targetlist of outer query is *not* A_STAR... +SELECT a, COUNT(a) FROM ( + tpg_table1 NORMALIZE tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x + GROUP BY a ORDER BY a; + +-- Equality qualifiers +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT * FROM ( + tpg_table3 NORMALIZE tpg_table4 + ON b = d + WITH (ts, te, ts, te) + ) x; + +-- Equality qualifiers with FQN inside ON- and WITH-clause +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT * FROM ( + tpg_table3 NORMALIZE tpg_table4 + ON tpg_table3.b = tpg_table4.d + WITH (tpg_table3.ts, tpg_table3.te, tpg_table4.ts, tpg_table4.te) + ) x; + +-- Alignment with aggregation where targetlist of outer query is *not* A_STAR... +-- Test column positions where ts and te are not the last two columns. +-- Please note: This was a restriction in an early implementation. +SELECT a, COUNT(a) FROM ( + tpg_table3 NORMALIZE tpg_table4 + ON b = d + WITH (ts, te, ts, te) + ) x + GROUP BY a ORDER BY a; + +-- Test relations with differently named temporal bound attributes and relation +-- and column aliases. +SELECT * FROM ( + tpg_table1 NORMALIZE tpg_table2 x(c,d,s,e) + ON b = d + WITH (ts, te, s, e) + ) x; + +-- Normalizer's USING clause (self-normalization) +SELECT * FROM ( + tpg_table1 t1 NORMALIZE tpg_table1 t2 + USING (a) + WITH (ts, te, ts, te) + ) x; + +-- Range types for temporal boundaries, i.e., valid time columns +SELECT * FROM ( + tpg_table5 NORMALIZE tpg_table6 + USING (b) + WITH (t, t) + ) x; + +-- Collation and varchar boundaries +SELECT * FROM ( + tpg_table7 x NORMALIZE tpg_table7 y + ON TRUE + WITH (ts, te, ts, te) + ) x; + +-- Collation and varchar boundaries with incompatible collations (ERROR expected) +SELECT * FROM ( + tpg_table8 x NORMALIZE tpg_table8 y + ON TRUE + WITH (ts, te, ts, te) + ) x; + +-- +-- TEMPORAL NORMALIZER: SELECTION PUSH-DOWN +-- + +-- VALID TIME columns are not safe to be pushed down, for the rest everything +-- should work as usual. + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 NORMALIZE tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3; + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 NORMALIZE tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3 AND ts > 3; + +EXPLAIN (COSTS OFF) SELECT * FROM ( + tpg_table2 NORMALIZE tpg_table1 + ON TRUE + WITH (ts, te, ts, te) + ) x + WHERE c < 3 OR ts > 3; + +-- +-- TEMPORAL NORMALIZER: DATA TYPES +-- + +-- Data types: Timestamps +-- We use to_char here to be sure that we have the same output format on all +-- platforms and locale configuration +SELECT a, to_char(ts, 'YYYY-MM-DD') ts, to_char(te, 'YYYY-MM-DD') te FROM ( + tpg_table9 t1 NORMALIZE tpg_table9 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + +-- Data types: Double precision +SELECT a, ts, te FROM ( + tpg_table10 t1 NORMALIZE tpg_table10 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + +-- Data types: Double precision with +/- infinity +SELECT a, ts, te FROM ( + tpg_table11 t1 NORMALIZE tpg_table11 t2 + ON t1.a = 0 + WITH (ts, te, ts, te) + ) x; + +-- +-- TEMPORAL ALIGNER AND NORMALIZER: VIEWS +-- + +-- Views with temporal normalization +CREATE TEMP VIEW v AS SELECT * FROM ( + tpg_table1 NORMALIZE tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x; + +TABLE v; +DROP VIEW v; + +-- Views with temporal alignment +CREATE TEMP VIEW v AS SELECT * FROM ( + tpg_table1 ALIGN tpg_table2 + ON b = d + WITH (ts, te, ts, te) + ) x; + +TABLE v; +DROP VIEW v; + +-- Testing temporal normalization with ambiguous columns, i.e. columns that +-- are used internally... +CREATE TEMP VIEW v AS SELECT * FROM ( + tpg_table1 AS r(p1, p1_0, "p1_-1", p1_1) NORMALIZE tpg_table2 s + ON r.p1_0 = s.d + WITH ("p1_-1", p1_1, ts, te) + ) x; + +TABLE v; +DROP VIEW v; + +-- Testing temporal alignment with ambiguous columns, i.e. columns that +-- are used internally... +CREATE TEMP VIEW v AS SELECT * FROM ( + tpg_table1 AS r(p1, p1_0, "p1_-1", p1_1) ALIGN tpg_table2 s + ON r.p1_0 = s.d + WITH ("p1_-1",p1_1,ts,te) + ) x; + +TABLE v; +DROP VIEW v; + +
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers