Hi Timo,

Thanks for this FLIP. All three improvements address real production needs.

One question on the broadcast state design: the FLIP describes NOTIFY_STATEFUL_SETS as re-invoking eval() for each existing stateful set with a set key context. The collect() prohibition is scoped to broadcast-state-only processing, so I assume collect() is permitted during the per-key NOTIFY re-evaluation. Is that correct?

If so, it enables a useful pattern: broadcast a rule change and immediately re-emit corrected results across all existing keys (accepting the cost of full key iteration).
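
To make the pattern concrete, here is a minimal plain-Java simulation of what I have in mind. It deliberately uses no Flink API: the class RuleBroadcastSketch and the methods onRow(), onBroadcast(), and emit() are hypothetical names made up for illustration, and the loop over all keys only mimics my reading of NOTIFY_STATEFUL_SETS, assuming collect() is allowed in that per-key pass.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation only; none of these names are Flink APIs.
public class RuleBroadcastSketch {
    // Stand-in for per-key state: the running total for each key.
    private final Map<String, Long> totalsByKey = new LinkedHashMap<>();
    // Stand-in for broadcast state: a single threshold rule.
    private long threshold = 100L;
    // Stand-in for collect(): every emitted result is appended here.
    private final List<String> emitted = new ArrayList<>();

    // Keyed input row: update this key's state and emit its current result.
    public void onRow(String key, long amount) {
        long total = totalsByKey.merge(key, amount, Long::sum);
        emit(key, total);
    }

    // Broadcast row: update the rule, then re-evaluate every existing key.
    // This full iteration is the cost mentioned above.
    public void onBroadcast(long newThreshold) {
        threshold = newThreshold;
        for (Map.Entry<String, Long> entry : totalsByKey.entrySet()) {
            emit(entry.getKey(), entry.getValue());
        }
    }

    // Classify a key's total against the current rule and record the result.
    private void emit(String key, long total) {
        emitted.add(key + "=" + (total > threshold ? "ALERT" : "OK"));
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        RuleBroadcastSketch sketch = new RuleBroadcastSketch();
        sketch.onRow("a", 50);
        sketch.onRow("b", 150);
        sketch.onBroadcast(40); // rule change re-emits results for "a" and "b"
        System.out.println(sketch.emitted());
    }
}
```

In this sketch the key "a" is first reported as OK and then corrected to ALERT once the lower threshold is broadcast, without waiting for another row on that key. Whether the real design permits this depends on the answer to the question above.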

Best,
Natea

On Fri, Mar 6, 2026 at 1:13 AM Timo Walther <[email protected]> wrote:

> Hi everyone,
>
> if there are no objections, I would start a VOTE on Monday.
>
> Thanks,
> Timo
>
> On 05.03.26 10:02, Gustavo de Morais wrote:
> > Hi Timo,
> >
> > Thank you for proposing these improvements. All address real pain points,
> > so +1. It's especially good to see BROADCAST_SEMANTIC_TABLE. This unlocks
> > a set of use cases involving small lookup tables that can be considerably
> > optimized. I'm also +1 on supporting ORDER BY instead of an additional
> > argument trait.
> >
> > Thanks for continuing to push PTFs forward - they are becoming really
> > powerful.
> >
> > Kind regards,
> > Gustavo
> >
> > On Wed, 4 Mar 2026 at 16:40, Ryan van Huuksloot via dev
> > <[email protected]> wrote:
> >
> >> That makes sense to me. First make it work; then, make it easy.
> >>
> >> Otherwise the FLIP looks good to me. Some great improvements! Thanks for
> >> putting this together.
> >>
> >> Ryan van Huuksloot
> >> Staff Engineer, Infrastructure | Streaming Platform
> >> [image: Shopify]
> >> <https://www.shopify.com/?utm_medium=salessignatures&utm_source=hs_email>
> >>
> >> On Wed, Mar 4, 2026 at 9:22 AM Timo Walther <[email protected]> wrote:
> >>
> >>> Hi Ryan,
> >>>
> >>> thanks for the great feedback. I agree that some parts might still be
> >>> too complex; usability is definitely a continuous effort. For now, the
> >>> main goal of PTFs was to unblock people when something cannot be
> >>> expressed with SQL or would lead to very inefficient query plans. Also,
> >>> they rather target a developer persona, usually a platform team that
> >>> develops PTFs for SQL personas. In the mid-term, I hope that AI will
> >>> implement most of the PTFs. So exposing engine primitives / building
> >>> blocks for AI is crucial.
> >>>
> >>> Maybe we can also offer a SimpleProcessFunction at some point, once we
> >>> know better why and how people use PTFs. Also having more built-in PTFs
> >>> that address the most frequent tasks can be very helpful.
> >>>
> >>> Please continue sharing your experiences: What are frequent tasks? What
> >>> do users want to achieve with PTFs?
> >>>
> >>> Cheers,
> >>> Timo
> >>>
> >>> On 03.03.26 21:09, Ryan van Huuksloot via dev wrote:
> >>>> Hi Timo,
> >>>>
> >>>> Thanks for the FLIP.
> >>>>
> >>>> Internally, we've started using PTFs and are still figuring out how to
> >>>> best leverage them.
> >>>> The improvements you proposed in your FLIP are great.
> >>>> I wanted to mention the priority order for the 3 improvements you've
> >>>> recommended. I would prioritize them in the order you stated, based on
> >>>> our usage. So far I haven't had any broadcast requests but I'm sure
> >>>> they're coming. The late arriving data will be very helpful.
> >>>>
> >>>> My primary concern with PTFs and large state is generally the
> >>>> complexity of the state decisions. Most of our SQL developers won't
> >>>> understand when to use a "[Map][List][Value]View" with a PTF.
> >>>> Specifically this area in the documentation:
> >>>>
> >>>> https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/functions/ptfs/#large-state
> >>>>
> >>>> You really need to understand Java concepts to grasp the intricacies of
> >>>> your decisions when choosing a state mechanism. I wonder if we can
> >>>> simplify this decision for engineers who may not be Flink and Java
> >>>> experts. It may not be possible.
> >>>>
> >>>> Ryan van Huuksloot
> >>>> Staff Engineer, Infrastructure | Streaming Platform
> >>>>
> >>>> On Tue, Mar 3, 2026 at 3:47 AM Timo Walther <[email protected]> wrote:
> >>>>
> >>>>> Hi everyone,
> >>>>>
> >>>>> Just bumping this thread again and happy to gather any feedback you
> >>>>> have.
> >>>>>
> >>>>> Thanks,
> >>>>> Timo
> >>>>>
> >>>>> On 16.02.26 09:35, Timo Walther wrote:
> >>>>>> Hi everyone,
> >>>>>>
> >>>>>> the ProcessTableFunction (PTF) feature has been well received by the
> >>>>>> Flink community and its adoption is increasing. Since FLIP-440 [1]
> >>>>>> introduced a lot of new API and new concepts, some design decisions
> >>>>>> need smaller adjustments around late data handling and lazy state
> >>>>>> access.
> >>>>>>
> >>>>>> Also, talking to community members at Current and Flink Forward
> >>>>>> conferences has shown that broadcast state is crucial to bridge the
> >>>>>> gap to DataStream API applications for broadcast joining and
> >>>>>> rule-based logic.
> >>>>>>
> >>>>>> I would like to propose "FLIP-565: Improve ProcessTableFunctions for
> >>>>>> late data handling and state access" [2].
> >>>>>>
> >>>>>> This FLIP proposes 3 important PTF improvements:
> >>>>>>
> >>>>>> 1) Don’t drop late data in ProcessFunction as data-loss is usually
> >>>>>> not intended; similar to DataStream API’s ProcessFunction
> >>>>>>
> >>>>>> 2) Introduce ValueView to enable a “supplier”-pattern for state
> >>>>>> access; similar to MapView and ListView
> >>>>>>
> >>>>>> 3) Introduce BROADCAST_SEMANTIC_TABLE as a new kind of argument to
> >>>>>> PTFs
> >>>>>>
> >>>>>> Regarding forward compatibility, all proposed items can be made
> >>>>>> available in batch mode eventually for a unified experience. From my
> >>>>>> point of view, these remaining adjustments should make PTF fully
> >>>>>> production ready; I don't expect any major additions in the mid-term.
> >>>>>>
> >>>>>> Looking forward to your feedback.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Timo
> >>>>>>
> >>>>>> [1] https://cwiki.apache.org/confluence/x/pQnPEQ
> >>>>>> [2] https://cwiki.apache.org/confluence/x/qIo8G
