Hi, all
I have opened a new discussion for FLIP-132[1], which is based on the
consensus reached in the current thread.
Let's continue the conversation in the new thread; please let me know if you
have any concerns.
Best
Leonard
[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-132-Temporal-Ta
I mistyped the rejected_query; it should be:
CREATE VIEW post_agg_stream AS SELECT currencyId, AVG(rate) AS rate
FROM currency_rates
CREATE VIEW rejected_query AS
SELECT
...FROM
transactions AS t
JOIN post_agg_stream FOR SYSTEM_TIME AS OF t.transactionTime AS r
ON r.currency = t.curr
Hey Leonard,
Agreed, this is a fun discussion!
> (1) To support changelog sources backed by CDC tools, one problem is whether
> we can use the temporal table as a general source table which may be followed
> by some aggregation operations; more precisely, whether the aggregation
> operator can use the DELETE
Hi everyone,
Thanks a lot for the great discussions so far.
After reading through the long discussion, I still have one question.
Currently the temporal table function supports both event time and proc
time joining.
If we use "FOR SYSTEM_TIME AS OF" syntax without "TEMPORAL" keyword in DDL,
does
Hi, Seth
Thanks for your explanation of the use cases. You're right that the lookup
join/table is one kind of temporal table join/table, which tracks the latest
snapshot of external DB-like tables; that's why we proposed using the same
temporal join syntax.
In fact, I have investigated and checked Debezium form
As an aside, I conceptually view temporal table joins to be semantically
equivalent to look up table joins. They are just two different ways of
consuming the same data.
Seth
On Mon, Jul 6, 2020 at 8:56 AM Seth Wiesman wrote:
> Hi Leonard,
>
> Regarding DELETE operations I tend to have the oppos
Hi Leonard,
Regarding DELETE operations I tend to have the opposite reaction. I spend a
lot of time working with production Flink users across a large number of
organizations and to say we don't support temporal tables that include
DELETEs will be a blocker for adoption. Even organizations that cl
Hi, Konstantin
> Would we support a temporal join with a changelog stream with
> event time semantics by ignoring DELETE messages or would it be completely
> unsupported.
I don't know how common this feature is in temporal scenarios.
Compared to supporting the approximate event time join by i
Hi Leonard,
Thank you for the summary. I don't fully understand the implications of
(3). Would we support a temporal join with a changelog stream with
event time semantics by ignoring DELETE messages, or would it be completely
unsupported? I mean something like the following sequence of statements:
Thanks Jingsong, Jark, Knauf, Seth for sharing your thoughts.
Although we discussed many details about the concept, I think it's worth
clarifying the semantics with the long-term goals in mind. The temporal table
concept was first introduced in SQL:2011. I did some investigation of the
temporal table working mechanism
It is clear there are a lot of edge cases with temporal tables that need to
be carefully thought out. If we go at this problem from the perspective of
what a majority of users need to accomplish in production, I believe there
is a simpler version of this problem we can solve that can be expanded in
Hi everyone,
well, this got complicated :) Let me add my thoughts:
* Temporal Table Joins are already quite hard to understand for many users.
If need be, we should trade off for simplicity.
* The important case is the *event time* temporal join. In my understanding
processing time temporal join
Thanks for your discussion.
Looks like the problem is supporting the versioned temporal table for the
changelog source.
I want to share more of my thoughts:
When I think about changelog sources, I treat it as a view like: "CREATE
VIEW changelog_table AS SELECT ... FROM origin_table GROUP BY ..."
Hi all,
Thanks Leonard for summarizing our discussion. I want to share more of my
thoughts:
* rowtime is a column in its schema, so the rowtime of a DELETE event is
the value of the previous image.
* operation time is the time when the DML statements happen in databases,
so the operation time o
Hi, Kurt, Fabian
After an offline discussion with Jark, we think that the 'PERIOD FOR
SYSTEM_TIME(operation_time)' statement might be needed now. A changelog table
is a superset of an insert-only table; using PRIMARY KEY and rowtime may work
well for an insert-only or upsert source but has some problems in
Hi, everyone
Thanks Fabian and Kurt for making the multiple versions (event time) clear. I
also like the 'PERIOD FOR SYSTEM' syntax, which is supported in the SQL
standard. I think we can add some explanation of the multiple version support
to the future section of the FLIP.
For the PRIMARY KEY semantics, I agre
Thanks Kurt,
Yes, you are right.
The `PERIOD FOR SYSTEM_TIME` that you linked before corresponds to the
VERSION clause that I used and would explicitly define the versioning of a
table.
I didn't know that the `PERIOD FOR SYSTEM_TIME` clause is already defined by
the SQL standard.
I think we would n
Hi Fabian,
I agree with you that implicitly letting event time be the version of the
table will work in most cases, but not for all. That's the reason I mentioned
the `PERIOD FOR` [1] syntax in my first email, which is already in the SQL
standard to represent the validity of each row in the table.
If
Hi everyone,
Every table with a primary key and an event-time attribute provides what is
needed for an event-time temporal table join.
I agree that, from a technical point of view, the TEMPORAL keyword is not
required.
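To make the primary key plus event-time argument concrete, here is a minimal sketch in plain Python (not Flink code, and not how the Flink runtime implements it). It assumes a changelog given as hypothetical (key, event_time, value) tuples and resolves, for each probe row, the version that was valid at the probe row's event time. All function and variable names are illustrative only.

```python
from bisect import bisect_right
from collections import defaultdict


def build_versions(changelog):
    """Group rows by primary key and sort each key's versions by event time."""
    versions = defaultdict(list)
    for key, event_time, value in changelog:
        versions[key].append((event_time, value))
    for rows in versions.values():
        rows.sort(key=lambda row: row[0])
    return versions


def lookup_as_of(versions, key, as_of):
    """Return the value valid for `key` at time `as_of`, or None if no version exists yet."""
    rows = versions.get(key, [])
    times = [event_time for event_time, _ in rows]
    # Index of the first version that starts strictly after `as_of`.
    idx = bisect_right(times, as_of)
    return rows[idx - 1][1] if idx > 0 else None


def temporal_join(probe_rows, versions):
    """Attach to each (key, event_time, payload) row the version valid at its event time."""
    return [
        (key, ts, payload, lookup_as_of(versions, key, ts))
        for key, ts, payload in probe_rows
    ]


# Hypothetical data: currency rate versions and transactions.
rates = [("EUR", 1, 1.10), ("EUR", 5, 1.20), ("USD", 2, 1.00)]
transactions = [("EUR", 3, 100), ("EUR", 5, 50), ("GBP", 4, 10)]
joined = temporal_join(transactions, build_versions(rates))
```

In this sketch unmatched probe rows are kept with a None value (left-join style); an inner join would drop them instead. The primary key identifies which row is being versioned and the event-time attribute marks the start of each version's validity, which is all the lookup needs.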
I'm more sceptical about implicitly deriving the versioning information of
a (
I'm also +1 for not adding the TEMPORAL keyword.
+1 to make the PRIMARY KEY semantics clear for sources.
From my point of view:
1) PRIMARY KEY on changelog source:
It means that when the changelogs (INSERT/UPDATE/DELETE) are materialized,
the materialized table should be unique on the primary ke
Hi everyone,
I also agree with Leonard/Kurt's proposal for CREATE TEMPORAL TABLE.
Best,
Konstantin
On Mon, Jun 22, 2020 at 10:53 AM Kurt Young wrote:
> I agree with Timo, the semantics of the primary key need more thought and
> discussion, especially after FLIP-95 and FLIP-105.
>
> Best,
> Kurt
>
I agree with Timo, the semantics of the primary key need more thought and
discussion, especially after FLIP-95 and FLIP-105.
Best,
Kurt
On Mon, Jun 22, 2020 at 4:45 PM Timo Walther wrote:
> Hi Leonard,
>
> thanks for the summary.
>
> After reading all of the previous arguments and working on FLIP-9
Hi Leonard,
thanks for the summary.
After reading all of the previous arguments and working on FLIP-95, I
would also lean towards the conclusion of not adding the TEMPORAL keyword.
After FLIP-95, what we considered as a CREATE TEMPORAL TABLE can be
represented as a CREATE TABLE with PRIMARY
Hi everyone,
Thanks for the nice discussion. I'd like to move the work forward, so let me
briefly summarize the main opinions and the current divergences.
1. The agreements that have been reached:
1.1 The motivation for discussing temporal table DDL is just to create a
temporal table in pure SQL
Thanks for sharing your opinion. From your description, I can see there are
some very small divergences between us. I think it would be a good idea to
discuss these first.
Let's put aside the table version for now, and only discuss whether
a DDL table should be treated as a DBMS style
I think Flink should behave similar to other DBMSs.
Other DBMSs do not allow querying the history of a table, even though the
DBMS has seen all changes of the table (as transactions or directly as a
changelog if the table was replicated) and recorded them in its log.
You need to declare a table as
All tables described by Flink's DDL are dynamic tables. But a dynamic
table is more of a logical concept than a physical thing.
Physically, a dynamic table has two different forms: one is a materialized
table which changes over time (e.g. database table, HBase table),
the other form is stream
I think we need the TEMPORAL TABLE syntax because they are conceptually
more than just regular tables.
In addition to being a table that always holds the latest values (and can
thereby serve as input to a continuous query), the system also needs to
track the history of such a table to be able to
I might have missed something, but why do we need a new "TEMPORAL TABLE" syntax?
According to Fabian's first mail:
> Hence, the requirements for a temporal table are:
> * The temporal table has a primary key / unique attribute
> * The temporal table has a time-attribute that defines the start of the
> val
Hi,
I agree with what Fabian said above.
Besides, IMO, (3) has a lower priority and will involve many more things.
It makes sense to me to do it in two phases.
Regarding (3), the key point of converting an append-only table into a
changelog table is that the framework should know the operation type,
so
Thanks for the summary Konstantin.
I think you got all points right.
IMO, the way forward would be to work on a FLIP to define
* the concept of temporal tables,
* how to feed them from retraction tables
* how to feed them from append-only tables
* their specification with CREATE TEMPORAL TABLE,
*
Hi everyone,
Thanks everyone for joining the discussion on this. Please let me summarize
what I have understood so far.
1) For joining an append-only table and a temporal table, the "FOR
SYSTEM_TIME AS OF" syntax seems to be preferred (Fabian, Timo,
Seth).
2) To define a temporal table based
Hi Fabian,
Just to clarify a little bit, we decided to move the "converting
append-only table into changelog table" into future work.
So FLIP-105 only introduced some CDC formats (debezium) and new TableSource
interfaces proposed in FLIP-95.
I should have started a new FLIP for the new CDC formats
Thanks Jark!
I certainly need to read up on FLIP-105 (and I'll try to adjust my
terminology to changelog table from now on ;-) )
If FLIP-105 addresses the issue of converting an append-only table into a
changelog table that upserts on primary key (basically what the VIEW
definition in my first ema
Hi Fabian,
I think converting an append-only table into a temporal table involves two
things:
(1) converting the append-only table into a changelog table (or retraction
table, as you said)
(2) defining the converted changelog table (maybe a view for now) as temporal
(or history tracked).
The first thing is a
Hi,
I agree with most of what Timo said.
The TEMPORAL keyword (which unfortunately might be easily confused with
TEMPORARY...) looks very intuitive and I think using the only time
attribute for versioning would be a good choice.
However, TEMPORAL TABLE on retraction tables does not solve the full
I really like the TEMPORAL keyword, I find it very intuitive.
> The down side of this approach would be that an additional preprocessing
> step would not be possible anymore because there is no preceding view.
>
Yes and no. My understanding is we are not talking about making any
changes to how tem
Hi Fabian,
thank you very much for this great summary!
I wasn't aware of the Polymorphic Table Functions standard. This is a
very interesting topic that we should definitely consider in the future.
Maybe this could also help us in defining tables more dynamically within
a query. It could help
Hi all,
First of all, I apologize for the text wall that's following... ;-)
A temporal table join joins an append-only table and a temporal table.
The question about how to represent a temporal table join boils down to two
questions:
1) How to represent a temporal table
2) How to specify the jo
Hi Konstantin,
Thanks for bringing up this discussion. I think temporal join is a very
important feature and should be exposed to pure SQL users.
And I have already received many requests like this.
However, my concern is how to properly support this feature in SQL.
Introducing a DDL syntax for T
Hi Konstantin,
Thanks for bringing up this discussion. +1 for the idea.
We have run into this in our company too, and I recently planned to support
it in our internal branch.
Regarding your questions,
1) I think it might be more a table/view than function, just like Temporal
Table (which is also kn
Hi everyone,
it would be very useful if temporal tables could be created via DDL.
Currently, users either need to do this in the Table API or in the
environment file of the Flink CLI, which both require the user to switch
the context of the SQL CLI/Editor. I recently created a ticket for this
req