Re: [DISCUSS] FLIP-531: Initiate Flink Agents as a new Sub-Project

Xintong Song Wed, 21 May 2025 19:54:22 -0700

Thanks everyone for the positive feedback.

As I said, this FLIP is intended for discussing high-level plans for the
new project. The project itself is still at an early stage, and some of the
technical designs and solutions are not completely ready yet. So atm I can
only share some personal thoughts on the raised questions, and we are open
to suggestions and opinions.


@Jing

1. Regarding MCP, I think it's just one way (and likely a major way) for
providing LLMs with context, but not the only way. E.g., a user may write a
dedicated python function and provide it to the LLM as a tool, which
doesn't necessarily need to go through the MCP protocol. At the same time, the
LLM may discover more available tools from an MCP server. These are just 2
different sources that the tools come from, and they can co-exist.

2. In the long term, yes, I think so. As a first step, we will probably be
more focused on how to build individual agents, and less on interactions
across multiple agents. Not saying we won't support MAS in the first step,
but maybe not something as complex as the A2A protocol.

3. Interactions between agents will be event-driven, so they are naturally
asynchronous. I'm not entirely sure about use cases that prefer
asynchronous agent calls. Could you share some examples?

4. I don't think I fully get the taxonomy here. Why embedding vs. workflow?
From my understanding, Flink Agents should cover both use cases.

5. Yes, memory is considered. Actually, Flink's state management makes a
good foundation for supporting agent memory.
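
To illustrate point 1 above (tools coming from different sources and
co-existing), here is a minimal sketch in plain Python. Everything in it is
hypothetical: `ToolRegistry`, the two example tools, and
`discover_from_stub_server` are made up for illustration and are not part of
Flink Agents or of any MCP SDK. The stub simply stands in for whatever a real
MCP client would return after discovering tools from a server.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical registry that holds tools regardless of where they came from."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., object]] = {}
        self._sources: Dict[str, str] = {}

    def register(self, name: str, fn: Callable[..., object], source: str) -> None:
        # Reject name clashes so a discovered tool cannot silently shadow a local one.
        if name in self._tools:
            raise ValueError(f"duplicate tool name: {name}")
        self._tools[name] = fn
        self._sources[name] = source

    def call(self, name: str, **kwargs: object) -> object:
        return self._tools[name](**kwargs)

    def source_of(self, name: str) -> str:
        return self._sources[name]

    def names(self) -> List[str]:
        return sorted(self._tools)


def celsius_to_fahrenheit(celsius: float) -> float:
    """A dedicated Python function handed to the LLM directly as a tool."""
    return celsius * 9 / 5 + 32


def discover_from_stub_server() -> Dict[str, Callable[..., object]]:
    """Stand-in for tool discovery from an MCP server (assumption: no real protocol here)."""
    return {"word_count": lambda text: len(text.split())}


registry = ToolRegistry()
registry.register("celsius_to_fahrenheit", celsius_to_fahrenheit, source="local")
for tool_name, tool_fn in discover_from_stub_server().items():
    registry.register(tool_name, tool_fn, source="mcp")
```

From the caller's side the two kinds of tools look the same: both are invoked
through `registry.call(...)`, and the source is only metadata.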

@Nishita

1. I think calling an external LLM is similar to an async operator in
Flink, in terms of potential latency and backpressure issues. Flink's async
operator already supports concurrent async calls, rate control, timeout
handling, etc. But ultimately the bottleneck is on the external service
side, and we expect model technology to keep improving, with higher
throughput, lower latency, and better stability.

2. Good question. I think real-time event-driven processing is somewhat in
conflict with asynchronous human-in-the-loop feedback. One idea, which I've
seen people use, is to build another agent for validating results and
generating feedback. Another idea is to collect samples of results for
asynchronous human-in-the-loop validation. But these are just rough ideas.
I don't have sophisticated answers at the moment.
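
As a rough illustration of point 1 above (concurrent async calls with a cap
on in-flight requests and a per-call timeout), here is a minimal sketch using
plain Python `asyncio`. It is not Flink's async operator and not a Flink
Agents API; `call_with_limits`, `fake_model`, and `run_demo` are hypothetical
names, and `fake_model` only simulates an external LLM with a sleep.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def call_with_limits(
    inputs: List[str],
    call_model: Callable[[str], Awaitable[str]],
    max_in_flight: int,
    timeout_s: float,
) -> List[Optional[str]]:
    """Call the model for every input, bounding concurrency and per-call latency."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def one(item: str) -> Optional[str]:
        # The semaphore caps how many requests are outstanding at once.
        async with semaphore:
            try:
                return await asyncio.wait_for(call_model(item), timeout=timeout_s)
            except asyncio.TimeoutError:
                # A timed-out call yields None instead of failing the whole batch.
                return None

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(one(item) for item in inputs))


async def fake_model(prompt: str) -> str:
    """Stand-in for an external LLM; the prompt "slow" simulates a stalled request."""
    await asyncio.sleep(0.5 if prompt == "slow" else 0.01)
    return prompt.upper()


def run_demo() -> List[Optional[str]]:
    return asyncio.run(
        call_with_limits(["a", "slow", "b"], fake_model, max_in_flight=2, timeout_s=0.1)
    )
```

With these settings `run_demo()` returns `["A", None, "B"]`: the stalled call
is cut off by the timeout while the other two complete. The cap and the
timeout protect the caller, but as noted above they do not remove the
bottleneck on the external service side.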

Best,

Xintong



On Thu, May 22, 2025 at 3:26 AM Yash Anand <yan...@confluent.io.invalid>
wrote:

> Thank you for the proposal—this initiative will make it much easier to
> build event-driven AI agents seamlessly.
>
> +1 for the proposed Flink Agents sub-project!
>
> On Wed, May 21, 2025 at 9:43 AM Mayank Juneja <mayankjunej...@gmail.com>
> wrote:
>
> > +1 on the FLIP. This is a solid step toward building an agentic offering
> > that really leans into Flink’s strengths, and builds on the momentum from
> > recent API improvements like FLIP-437 and the proposed FLIP-529.
> >
> > Also wanted to echo the point around agent memory. More advanced agentic
> > systems really benefit from both short-term and long-term memory. While
> > long-term memory can live in databases (including vector stores), having a
> > built-in abstraction for managing short-term memory would be super useful.
> > Doesn’t need to be in the MVP, but definitely worth considering for the
> > roadmap.
> > Best,
> > Mayank
> >
> >
> > On Wed, May 21, 2025 at 4:54 PM Lincoln Lee <lincoln.8...@gmail.com>
> > wrote:
> >
> > > +1 for the proposed Flink Agents sub-project!
> > >
> > > This aligns perfectly with Flink's core strengths in real-time event
> > > processing and stateful computations.
> > >
> > > Thanks for driving this initiative and looking forward to the
> > > detailed technical designs.
> > >
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Hao Li <lihao3...@gmail.com> wrote on Wed, May 21, 2025 at 23:28:
> > >
> > > > Hi Xintong, Sean and Chris,
> > > >
> > > > Thanks for driving the initiative. Very exciting to bring AI agents to
> > > > Flink to empower streaming use cases.
> > > >
> > > > +1 to the FLIP.
> > > >
> > > > Thanks,
> > > > Hao
> > > >
> > > > On Wed, May 21, 2025 at 7:35 AM Nishita Pattanayak <
> > > > nishita.pattana...@gmail.com> wrote:
> > > >
> > > > > Hi Sean, Chris and Xintong. This seems to be a very exciting
> > > > > sub-project.
> > > > > +1 for the "flink-agents" sub-project.
> > > > >
> > > > > I was going through the FLIP and had some questions about it:
> > > > > 1. How would the external model calls (e.g., OpenAI or internal LLMs)
> > > > > be integrated into Flink tasks without introducing backpressure or
> > > > > latency issues?
> > > > > In my experience, calling an external LLM has the following risks: it
> > > > > is latency-sensitive (LLM inference can take hundreds of milliseconds
> > > > > to seconds), flaky (network issues, rate limits), and
> > > > > non-deterministic (with timeouts, retries, etc.). It would be great to
> > > > > work/brainstorm on how we solve these issues.
> > > > > 2. In traditional agent workflows, user feedback often plays a key
> > > > > role in validating and improving agent outputs. In a continuous,
> > > > > long-running Flink-based agent system, where interactions might not be
> > > > > user-facing or synchronous, how do we incorporate human-in-the-loop
> > > > > feedback or correctness signals to validate and iteratively improve
> > > > > agent behavior?
> > > > >
> > > > > This is a really exciting direction for the Flink ecosystem. The idea
> > > > > of building long-running, context-aware agents natively on Flink feels
> > > > > like a natural evolution of stream processing. I'd love to see this
> > > > > mature and would be excited to contribute in any way I can to help
> > > > > productionize and validate this in real-world use cases.
> > > > >
> > > > > On Wed, May 21, 2025 at 8:52 AM Xintong Song <tonysong...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi devs,
> > > > > >
> > > > > > Sean, Chris and I would like to start a discussion on FLIP-531 [1],
> > > > > > about introducing a new sub-project, Flink Agents.
> > > > > >
> > > > > > With the rise of agentic AI, we have identified great new
> > > > > > opportunities for Flink, particularly in system-triggered agent
> > > > > > scenarios. We believe the future of AI agent applications is
> > > > > > industrialized, where agents will not only be triggered by users,
> > > > > > but increasingly by systems as well. Flink's capabilities in
> > > > > > real-time distributed event processing, state management and
> > > > > > exactly-once consistency fault tolerance make it well-suited as a
> > > > > > framework for building such system-triggered agents. Furthermore,
> > > > > > system-triggered agents are often tightly coupled with data
> > > > > > processing. Flink's outstanding data processing capabilities allow
> > > > > > seamless integration between data and agentic processing. These
> > > > > > capabilities differentiate Flink from other agent frameworks with
> > > > > > unique advantages in the context of system-triggered agents.
> > > > > >
> > > > > > We propose this effort as a sub-project of Apache Flink, with a
> > > > > > separate code repository and lightweight development process, for
> > > > > > rapid iteration during the early stage.
> > > > > >
> > > > > > Please note that this FLIP is focused on the high-level plans,
> > > > > > including motivation, positioning, goals, roadmap, and operating
> > > > > > model of the project. Detailed technical design is out of scope and
> > > > > > will be discussed during the rapid prototyping and iterations.
> > > > > >
> > > > > > For more details, please check the FLIP [1]. Looking forward to your
> > > > > > feedback.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Xintong
> > > > > >
> > > > > >
> > > > > > [1]
> > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-531%3A+Initiate+Flink+Agents+as+a+new+Sub-Peoject
> >
> > --
> > *Mayank Juneja*
> > Product Manager | Data Streaming and AI
> >
>
