Hi Timo,

Thanks for the FLIP! Also, thanks for saying that this has been in your head for a few years; there is a ton here.
1. For the pass-through semantics, if the partition columns are already listed in the pass-through columns, they are not duplicated, right?

2. A number of places mention that Calcite doesn't support XYZ. Do we have tickets for that work?

3. I like the scoping section. Relative to "More than two tables is out of scope for this FLIP.", will we need to change any of the interfaces in a major way to support multiple output tables in the future?

4. The section "Empty Semantics" under Scoping is a bit terse and I'll admit that I don't understand it. Could you say more in the FLIP there?

5. For Pass Through Semantics, will adding PASS THROUGH later be easy to do? Or is there some reason to avoid it completely?

6. nit: Under `TimedLastValue`, there is a line which is copied from the above: "The on_time descriptor takes two arguments in this case to forward the time attributes of each side."

7. Will upsert mode be possible later? Can you say more about why upsert is not supported? (I can guess, but it seems like a brief discussion in the FLIP would be useful.)

8. In terms of the two bullets at the end of the migration plan, I am for both changing the order of SESSION window columns and changing the name from TIMECOL to on_time (or supporting both names?). Is there any downside to doing so? (See the P.S. below the quoted mail for a rough sketch of what I mean.)

Thanks,
Jim

On Mon, Sep 23, 2024 at 6:38 PM Timo Walther <twal...@apache.org> wrote:

> Hi everyone,
>
> I'm super excited to start a discussion about FLIP-440: User-defined SQL
> operators / ProcessTableFunction (PTF) [1].
>
> This FLIP has been in my head for many years and Flink 2.0 is a good
> time to open up Flink SQL to a new category of use cases. Following the
> principle "Make simple things easy, and complex ones possible", I would
> like to propose a new UDF type "ProcessTableFunction" that gives users
> access to Flink's primitives for advanced stream processing. This should
> unblock people when hitting shortcomings in Flink SQL and expand the
> scope of SQL from analytical to more event-driven applications.
>
> This proposal is by no means a full replacement of DataStream API.
> DataStream API will always provide the full power of Flink whereas PTFs
> provide at least a necessary toolbox to cover ~80% of all use cases
> without leaving the SQL ecosystem. The SQL ecosystem is a great
> foundation with well-defined type system, catalog integration, CDC
> support, and built-in functions/operators. PTFs complete it by offering
> a standard compliant extension point.
>
> Looking forward to your feedback.
>
> Thanks,
> Timo
>
> [1] https://cwiki.apache.org/confluence/x/pQnPEQ
>
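
P.S. Regarding item 8, a rough sketch of what I mean, assuming the Bid example table from the windowing TVF docs (columns bidtime, price, item). The first statement is the session window call roughly as documented today; the second is only my guess at how a renamed on_time argument could be spelled, so the parameter names (data, on_time, gap) and their order are assumptions on my side, not something taken from the FLIP:

    -- Session window TVF roughly as in the current docs (positional form;
    -- TIMECOL is today's name for the descriptor when using named arguments):
    SELECT * FROM TABLE(
      SESSION(TABLE Bid PARTITION BY item, DESCRIPTOR(bidtime), INTERVAL '5' MINUTES));

    -- Hypothetical named-argument form after a TIMECOL -> on_time rename
    -- (parameter names and order are my guess, not from the FLIP):
    SELECT * FROM TABLE(
      SESSION(
        data    => TABLE Bid PARTITION BY item,
        on_time => DESCRIPTOR(bidtime),
        gap     => INTERVAL '5' MINUTES));

If both TIMECOL and on_time were accepted for a few releases, existing queries would keep working while new ones could move to the new name.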