+1 to the updated design. I agree with Fabian that the naming of "temporal table without version" is a bit confusing, but the actual semantics make sense to me. I think just saying it's a Flink-managed lookup join makes sense.
Seth

On Tue, Aug 18, 2020 at 3:07 PM Fabian Hueske <fhue...@gmail.com> wrote:

> Thanks for the updated FLIP Leonard!
> In my opinion this was an improvement.
> So +1 for this design.
>
> I have just one remark regarding the terminology.
> I find the term "Temporal Table without Version" somewhat confusing.
> IMO, versions are the core principle of temporal tables and temporal
> tables without versions don't make much sense to me.
>
> What makes such a table a "Temporal" table? Isn't it just a regular table?
> If I understand the proposal correctly, "Temporal Tables without Version"
> can only be used in processing time temporal table joins, because this join
> only requests the current version.
> But all regular tables can be used in processing time (temporal) table
> joins as well.
> It's basically the same as a lookup join, with the only difference that
> the table is maintained in Flink and not accessed in an external system
> (for example via JDBC).
>
> Are "Temporal Tables without Version" called "Temporal" because they can
> be used in "processing time temporal table joins" and due to its name this
> join needs to join something that's called "Temporal"?
> In that case, we might want to rename "processing time temporal table
> joins" into something else that does not imply a versioning.
> Maybe we can call them just lookup joins to avoid introducing another term?
>
> Thanks, Fabian
>
> On Tue, Aug 18, 2020 at 04:30 Rui Li <lirui.fu...@gmail.com> wrote:
>
>> Thanks Leonard for the clarifications!
>>
>> On Mon, Aug 17, 2020 at 9:17 PM Leonard Xu <xbjt...@gmail.com> wrote:
>>
>>>
>>> > But are we still able to track different views of such a
>>> > table through time, as rows are added/deleted to/from the table?
>>>
>>> Yes, in fact we support temporal tables from a changelog that contains all
>>> possible message types (INSERT/UPDATE/DELETE).
>>>
>>> > For example, suppose I have an append-only table source with event-time
>>> > and PK, will I be allowed to do an event-time temporal join with this
>>> > table?
>>>
>>> Yes, I list some examples in the doc; the example versioned_rates3 is
>>> exactly this case.
>>>
>>> Best
>>> Leonard
>>>
>>> >
>>> > On Wed, Aug 12, 2020 at 3:31 PM Leonard Xu <xbjt...@gmail.com> wrote:
>>> >
>>> >> Hi, all
>>> >>
>>> >> After a detailed offline discussion about the temporal table related
>>> >> concepts and behavior, we settled on a reliable solution and rejected
>>> >> several alternatives.
>>> >> Compared to the rejected alternatives, the proposed approach tells a more
>>> >> unified story and is also friendly to users and to the current Flink
>>> >> framework.
>>> >> I improved the FLIP[1] with the proposed approach and reorganized the
>>> >> document to make it clearer.
>>> >>
>>> >> Please let me know if you have any concerns, I’m looking forward to your
>>> >> comments.
>>> >>
>>> >> Best
>>> >> Leonard
>>> >>
>>> >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> >>
>>> >>> On Aug 4, 2020, at 21:25, Leonard Xu <xbjt...@gmail.com> wrote:
>>> >>>
>>> >>> Hi, all
>>> >>>
>>> >>> I’ve updated the FLIP[1] with the terminology `ChangelogTime`.
>>> >>>
>>> >>> Best
>>> >>> Leonard
>>> >>>
>>> >>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-132+Temporal+Table+DDL
>>> >>>
>>> >>>> On Aug 4, 2020, at 20:58, Leonard Xu <xbjt...@gmail.com> wrote:
>>> >>>>
>>> >>>> Hi, Timo
>>> >>>>
>>> >>>> Thanks for your response.
>>> >>>>
>>> >>>>> 1) Naming: Is operation time a good term for this concept? If I read
>>> >>>>> "The operation time is the time when the changes happened in system."
>>> >>>>> or "The system time of DML execution in database", why don't we call
>>> >>>>> it `ChangelogTime` or `SystemTime`? Introducing another terminology
>>> >>>>> of time in Flink should be thought through.
>>> >>>>
>>> >>>> I agree that we should think this through. I have considered the names
>>> >>>> `ChangelogTime` and `SystemTime` too, and I don’t have a strong opinion
>>> >>>> on the name.
>>> >>>>
>>> >>>> I proposed `operationTime` because most changelogs come from databases,
>>> >>>> where an action is usually called an `operation` rather than a `change`.
>>> >>>> Operation time is easier to understand for database users, but it reads
>>> >>>> more like database terminology.
>>> >>>>
>>> >>>> For `SystemTime`, users may be confused about which system the name
>>> >>>> refers to: Flink, the database, or the CDC tool. Maybe it’s not a good
>>> >>>> name.
>>> >>>>
>>> >>>> `ChangelogTime` is a pretty good choice that is more consistent with the
>>> >>>> existing terminology `Changelog` and `ChangelogMode`, so let me use
>>> >>>> `ChangelogTime` and I’ll update the FLIP.
>>> >>>>
>>> >>>>> 2) Exposing it through `org.apache.flink.types.Row`: Shall we also
>>> >>>>> expose the concept of time through the user-level `Row` type? The FLIP
>>> >>>>> does not mention this explicitly. I think we can keep it as an internal
>>> >>>>> concept but I just wanted to ask for clarification.
>>> >>>>
>>> >>>> Yes, I want to keep it as an internal concept. We have discussed that
>>> >>>> changelog time would be the third time concept (the other two are
>>> >>>> event-time and processing-time). It’s not easy for normal users to
>>> >>>> understand (or for us to help them understand) the three concepts
>>> >>>> accurately, and I have not found a big enough scenario in which users
>>> >>>> need to touch the changelog time for now, so I tend not to expose the
>>> >>>> concept to users.
>>> >>>>
>>> >>>> Best,
>>> >>>> Leonard
>>> >>>>
>>> >>>>>
>>> >>>>> On 04.08.20 04:58, Leonard Xu wrote:
>>> >>>>>> Thanks Konstantin,
>>> >>>>>> Regarding your questions, I hope my comments have addressed them; I
>>> >>>>>> also added a few explanations to the FLIP.
>>> >>>>>> Thank you all for the feedback,
>>> >>>>>> It seems everyone involved in this thread has reached a consensus.
>>> >>>>>> I will start a vote thread later.
>>> >>>>>> Best,
>>> >>>>>> Leonard
>>> >>>>>>> On Aug 3, 2020, at 19:35, godfrey he <godfre...@gmail.com> wrote:
>>> >>>>>>>
>>> >>>>>>> Thanks Leonard for driving this FLIP.
>>> >>>>>>> Looks good to me.
>>> >>>>>>>
>>> >>>>>>> Best,
>>> >>>>>>> Godfrey
>>> >>>>>>>
>>> >>>>>>> On Mon, Aug 3, 2020 at 12:04 PM Jark Wu <imj...@gmail.com> wrote:
>>> >>>>>>>
>>> >>>>>>>> Thanks Leonard for the great FLIP. I think it is in very good shape.
>>> >>>>>>>> +1 to start a vote.
>>> >>>>>>>>
>>> >>>>>>>> Best,
>>> >>>>>>>> Jark
>>> >>>>>>>>
>>> >>>>>>>> On Fri, 31 Jul 2020 at 17:56, Fabian Hueske <fhue...@gmail.com> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> Hi Leonard,
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks for this FLIP!
>>> >>>>>>>>> Looks good from my side.
>>> >>>>>>>>>
>>> >>>>>>>>> Cheers, Fabian
>>> >>>>>>>>>
>>> >>>>>>>>> On Thu, Jul 30, 2020 at 22:15 Seth Wiesman <sjwies...@gmail.com> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>> Hi Leonard,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Thank you for pushing this, I think the updated syntax looks
>>> >>>>>>>>>> really good and the semantics make sense to me.
>>> >>>>>>>>>>
>>> >>>>>>>>>> +1
>>> >>>>>>>>>>
>>> >>>>>>>>>> Seth
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Wed, Jul 29, 2020 at 11:36 AM Leonard Xu <xbjt...@gmail.com> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>> Hi, Konstantin
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> 1) A "Versioned Temporal Table DDL on source" can only be
>>> >>>>>>>>>>>> joined on the PRIMARY KEY attribute, correct?
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Yes, the PRIMARY KEY would be the join key.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> 2) Isn't it the time attribute in the ORDER BY clause of the
>>> >>>>>>>>>>>> VIEW definition that defines whether an event-time or
>>> >>>>>>>>>>>> processing time temporal table join is used?
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> I think whether an event-time or processing-time temporal table
>>> >>>>>>>>>>> join is used depends on the fact table’s time attribute in the
>>> >>>>>>>>>>> temporal join rather than on the temporal table side; the
>>> >>>>>>>>>>> event-time or processing-time attribute in the temporal table is
>>> >>>>>>>>>>> just used to split the validity periods of the versioned
>>> >>>>>>>>>>> snapshots of the temporal table. The processing time attribute
>>> >>>>>>>>>>> is not necessary for a temporal table without version; only the
>>> >>>>>>>>>>> primary key is required. The following VIEW is also valid for a
>>> >>>>>>>>>>> temporal table without version.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>   CREATE VIEW latest_rates AS
>>> >>>>>>>>>>>   SELECT currency, LAST_VALUE(rate)  -- only keep the latest version
>>> >>>>>>>>>>>   FROM rates
>>> >>>>>>>>>>>   GROUP BY currency;                 -- inferred primary key
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> 3) A "Versioned Temporal Table DDL on source" is always
>>> >>>>>>>>>>>> versioned on operation_time regardless of the lookup table
>>> >>>>>>>>>>>> attribute (event-time or processing time attribute), correct?
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Yes, the semantics of `FOR SYSTEM_TIME AS OF o.time` is to use
>>> >>>>>>>>>>> the o.time value to look up the version of the temporal table.
>>> >>>>>>>>>>> If the fact table has a processing time attribute, the join only
>>> >>>>>>>>>>> looks up the latest version of the temporal table, and the
>>> >>>>>>>>>>> implementation can be optimized, for example by keeping only the
>>> >>>>>>>>>>> latest version.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Best
>>> >>>>>>>>>>> Leonard
>>> >
>>> > --
>>> > Best regards!
>>> > Rui Li
>>>
>>
>> --
>> Best regards!
>> Rui Li
>>
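
For readers following the thread, the two kinds of lookup discussed above (the version valid at `o.time` versus only the latest version) can be modelled in a few lines of Python. This is a rough sketch and not Flink code: `VersionedTable`, `apply`, `as_of`, and `latest` are hypothetical names introduced here, timestamps are plain integers, and changelog rows are assumed to arrive in time order per key.

```python
from bisect import bisect_right


class VersionedTable:
    """Toy model of a versioned table keyed by primary key (not a Flink API)."""

    def __init__(self):
        self._times = {}   # primary key -> ascending list of version start times
        self._values = {}  # primary key -> values, parallel to _times (None = deleted)

    def apply(self, key, ts, value):
        """Apply one changelog row; value=None models a DELETE.

        Assumption: rows for a given key arrive in ascending time order.
        """
        self._times.setdefault(key, []).append(ts)
        self._values.setdefault(key, []).append(value)

    def as_of(self, key, ts):
        """Event-time style lookup: the version whose validity period contains ts."""
        times = self._times.get(key, [])
        i = bisect_right(times, ts)  # number of versions that started at or before ts
        return self._values[key][i - 1] if i else None

    def latest(self, key):
        """Processing-time style lookup: only the latest version is requested."""
        values = self._values.get(key)
        return values[-1] if values else None


rates = VersionedTable()
rates.apply("EUR", 1, 1.10)  # INSERT at t=1
rates.apply("EUR", 5, 1.20)  # UPDATE at t=5
rates.apply("EUR", 9, None)  # DELETE at t=9

print(rates.as_of("EUR", 3))  # version valid at t=3
print(rates.latest("EUR"))    # latest version (the key was deleted at t=9)
```

Because `latest` never consults older versions, a table used only this way could drop everything except the last row per key, which is the optimization mentioned for the processing-time case.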