Hey Ganesh,

I took a look through the doc and left some notes. I was going to circle
back to this feature in a bit, but I am happy to have someone more
acquainted with Beam Java thinking about it too!

On the whole I think the doc is a good starting point, but a lot of the
pieces need to be fleshed out more. The core transform logic for the base
class (e.g. how the provided client and request code is actually invoked)
is the main concern, along with some consideration for batching. The
client-side throttling piece is something of its own monster that isn't
strictly necessary for a POC remote inference transform, but it's worth at
least considering how it and the retry logic fit into the core class.
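To make the "how the provided client and request code is actually called"
question concrete, here is a minimal sketch of one possible shape: a
user-supplied handler interface that the base transform wraps with a retry
loop. All names here (RemoteModelHandler, RetryingCaller) are hypothetical
illustrations, not Beam API, and the Beam-specific DoFn plumbing is left out:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical user-facing contract: the user supplies the request code
// that turns a batch of inputs into model responses.
interface RemoteModelHandler<InputT, OutputT> {
  List<OutputT> request(List<InputT> batch) throws Exception;
}

// Sketch of the retry logic the base transform could wrap around the
// handler: retry transient failures with exponential backoff, rethrow
// the last exception once attempts are exhausted.
final class RetryingCaller<InputT, OutputT> {
  private final RemoteModelHandler<InputT, OutputT> handler;
  private final int maxAttempts;

  RetryingCaller(RemoteModelHandler<InputT, OutputT> handler, int maxAttempts) {
    this.handler = handler;
    this.maxAttempts = maxAttempts;
  }

  List<OutputT> call(List<InputT> batch) throws Exception {
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return handler.request(batch);
      } catch (Exception e) {
        last = e;
        // Exponential backoff before the next attempt.
        Thread.sleep((long) Math.pow(2, attempt) * 100L);
      }
    }
    throw last;
  }
}

public class RemoteInferenceSketch {
  public static void main(String[] args) throws Exception {
    // Stand-in for a real provider client call (Vertex AI, OpenAI, etc.).
    RemoteModelHandler<String, String> echo = batch -> {
      List<String> out = new ArrayList<>();
      for (String s : batch) out.add("echo:" + s);
      return out;
    };
    RetryingCaller<String, String> caller = new RetryingCaller<>(echo, 3);
    System.out.println(caller.call(List.of("a", "b")));
  }
}
```

A client-side throttler could then sit in front of `call` without the
handler contract changing, which is roughly the layering question I'd want
the doc to pin down.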

I'm happy to iterate on this with you, I think we can get a really useful
piece of code out of this with some work.

Thanks,

Jack McCluskey

On Wed, Oct 8, 2025 at 5:43 AM Ganesh Sivakumar <[email protected]>
wrote:

> Hi everyone, I came across this GitHub issue (
> https://github.com/apache/beam/issues/36253) for developing Java-native
> remote inference.
>
> To address it, I've put together a short design doc describing a new
> RemoteInference transform for the Java SDK to perform inference against
> model provider APIs like Vertex AI, OpenAI, Anthropic, and other model
> endpoints. I would like to kindly request feedback and suggestions on the
> transform design and internals.
>
> Document -
> https://docs.google.com/document/d/1b_JbiZoMJbjqi5ae0jnTaBr2gvvzbnWfxCRPvNdICVY/edit?usp=sharing
>
> Thanks,
> Ganesh.
>