One more question: what's the motivation, and what do you want to do, in the part `replace beam local execution`? I'm not sure if the goal is to improve the debugging experience. PyFlink already supports loopback mode [1], which allows debugging Python UDFs in the IDE without any setup (just set a breakpoint and run the job).
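For reference, a minimal sketch of that workflow; the `add_one` UDF and the sample data are just illustrative, and it assumes the job is launched directly from the IDE against a local mini-cluster, where recent PyFlink versions use loopback mode by default:

from pyflink.table import DataTypes, EnvironmentSettings, TableEnvironment
from pyflink.table.expressions import col
from pyflink.table.udf import udf

@udf(result_type=DataTypes.BIGINT())
def add_one(x):
    # Illustrative UDF: a breakpoint set on the next line is hit in the IDE
    # when the job runs locally in loopback mode.
    return x + 1

# Local (mini-cluster) execution, e.g. started via "Run"/"Debug" in the IDE.
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
t = t_env.from_elements([(1,), (2,), (3,)], ['x'])
t.select(add_one(col('x'))).execute().print()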
Regards,
Dian

[1] https://issues.apache.org/jira/browse/FLINK-21222

On Tue, Jul 1, 2025 at 10:25 AM Dian Fu <dian0511...@gmail.com> wrote:
>
> Hi Zander,
>
> Thanks for the reply. Makes sense to me!
>
> Some follow-up questions:
> 1) Are there follow-up sub-FLIP discussions? I'm asking this because this doc seems more like an umbrella which shapes the whole picture of what we want to do. For example, it includes things we could just do without voting, e.g. async scalar function and table function support, window TVF support, etc. It also contains things which seem like a big story and deserve a whole design doc, e.g. Data Exploration and EDA support, the inference UDF, the Numpy types, etc.
>
> 2) What do you mean by the inference UDF and Numpy types parts? Could you explain a bit more about them?
>
> Regards,
> Dian
>
> On Tue, Jul 1, 2025 at 6:37 AM Zander Matheson <a.w.mathe...@gmail.com> wrote:
> >
> > Thanks Dian Fu,
> >
> > On 1) This is more in reference to how, if we modify things like the builder pattern, we could end up changing certain configuration patterns. I will reframe this to say: currently there are no interfaces that are expected to be removed, but as the work evolves there may be some required changes to areas determined to be non-Pythonic, and the best effort will be made to mirror them or maintain an escape hatch.
> >
> > 2) I understand the desire to limit the scope here because we don't want to go down the Pandas parity rabbit hole. Would it suffice to limit the scope to foundational operations (Creation, Inspection and I/O) and Core manipulation (Selection, Indexing and Filtering)? These could be further outlined in issues. Maybe the following would suffice for this FLIP:
> >
> > Dataframe Methods
> >
> > Add friendly dataframe methods for creation, inspection and I/O that exist in other data libraries, like .read_json(), .read_csv(), .head(), .show() and .display().
> >
> > Reference table columns as attributes
> >
> > Allow pandas-like table.<my-col> references in addition to col("<my-col>") for all Table API arguments.
> >
> > Kwargs aliasing
> >
> > Allow polars-like table.agg(a_sum=<expr>) in addition to table.select(<expr>.alias("a_sum")) for providing named aliases via kwargs.
> >
> > 3) I am ok with removing this for now, although I do wish there was an easier way to include some of the most common connector interfaces.
> >
> > - Zander
> >
> > On Sun, Jun 29, 2025 at 11:13 PM Dian Fu <dian0511...@gmail.com> wrote:
> > >
> > > Hi Zander,
> > >
> > > Thanks for driving this effort! Big +1 overall. This will be a good improvement for Python users.
> > >
> > > Some quick questions about this FLIP:
> > >
> > > 1) Low-Level Knobs: Certain low-level, non-Pythonic configuration options may be deprecated or hidden to simplify the API surface.
> > >
> > > Could you give some examples of which configuration options you mean?
> > >
> > > 2) Introduction of user-friendly methods on the Table object for data preview, such as .show() and .display(), similar to those in other data-frame libraries.
> > >
> > > Dataframe-style APIs are widely adopted in the Python world. However, there are many convenient APIs in a DataFrame, so I guess it deserves a separate FLIP to discuss which kinds of API we want to borrow from it.
> > >
> > > 3) Package top connectors (Kafka, Parquet, S3) to reduce friction from manual JAR downloads.
> > >
> > > I'm not sure if this is feasible since the connector implementations have been moved to separate repos. However, I agree that the experience should be improved. Maybe we could provide some guides to improve the experience.
> > >
> > > Regards,
> > > Dian
> > >
> > > On Sat, Jun 28, 2025 at 5:24 AM Alexander Matheson <a.w.mathe...@gmail.com> wrote:
> > > >
> > > > Hi devs,
> > > >
> > > > I would like to start a discussion about a new FLIP for a rather large umbrella of work concerning PyFlink that Dian Fu, Xingbo Huang, myself and others have been coordinating around.
> > > >
> > > > As PyFlink continues to grow in adoption (downloads are up 10x YoY on PyPI!!!), it is overdue for additional investment to bring it in line with the expectations of the Python community. Given the increase in AI workloads shifting to real-time, these improvements will also help to support the net-new Flink users coming from that space.
> > > >
> > > > The project is called The Zen of Flink, as an ode to the driving principles of Python called the Zen of Python, and is broadly about making Flink more Pythonic. The work falls into six categories: API design, documentation, debuggability, local development, integration with the ecosystem and general usability. Not all of the work is concretely scoped yet, nor is it planned to be, as more improvements will arise as we work on this effort.
> > > >
> > > > The details of the FLIP can be found in the Google doc linked below.
> > > >
> > > > https://docs.google.com/document/d/18_u1XA9C_zdY_fu1OtQDwYyIk_TwjUfzAUOzhGxbN6w/edit?usp=sharing
> > > >
> > > > Looking forward to the discussion.
> > > >
> > > > Best,
> > > >
> > > > Zander
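To make the API ergonomics proposed in the thread concrete, below is a rough sketch contrasting today's Table API with the proposed style. The `orders` table and its columns are made up for illustration, and the attribute-style column references, kwargs aliasing, and .head()/.show() helpers in the commented-out portion are proposals from the discussion, not existing PyFlink APIs:

from pyflink.table import EnvironmentSettings, TableEnvironment
from pyflink.table.expressions import col

t_env = TableEnvironment.create(EnvironmentSettings.in_batch_mode())
orders = t_env.from_elements(
    [('a', 10), ('b', 25), ('a', 5)], ['user', 'amount'])

# Today: explicit col() references and .alias() for naming.
current = (orders
           .group_by(col('user'))
           .select(col('user'), col('amount').sum.alias('total')))
current.execute().print()

# Proposed (per the thread; not yet part of PyFlink):
# - attribute-style column references: orders.amount instead of col('amount')
# - kwargs aliasing: total=<expr> instead of <expr>.alias('total')
# - inspection helpers such as orders.head(5) / orders.show()
# proposed = (orders
#             .group_by(orders.user)
#             .select(orders.user, total=orders.amount.sum))
# proposed.show()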