Hey Ganesh, I took a look through the doc and left some notes. I was going to circle back to this feature in a bit, but I am happy to have someone more acquainted with Beam Java thinking about it too!
On the whole I think the doc is a good starting point, but a lot of the pieces need to be fleshed out more. The core transform logic for the base class (e.g. how the provided client and request code is actually called) is the main concern, along with some consideration for batching. The client-side throttling piece is kind of its own monster that isn't wholly necessary for a POC remote inference transform, but it's worth at least considering how it and the retry logic fit with the core class. I'm happy to iterate on this with you; I think we can get a really useful piece of code out of this with some work.

Thanks,
Jack McCluskey

On Wed, Oct 8, 2025 at 5:43 AM Ganesh Sivakumar <[email protected]> wrote:

> Hi everyone, I came across this GitHub issue
> (https://github.com/apache/beam/issues/36253) for developing a Java-native
> remote inference.
>
> To address it, I've put together a short design doc that proposes a new
> RemoteInference transform for the Java SDK to perform inference with model
> provider APIs like Vertex AI, OpenAI, Anthropic, and other model endpoints.
> I would like to kindly request feedback and suggestions on the transform
> design and internals.
>
> Document -
> https://docs.google.com/document/d/1b_JbiZoMJbjqi5ae0jnTaBr2gvvzbnWfxCRPvNdICVY/edit?usp=sharing
>
> Thanks,
> Ganesh.
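[Editor's sketch] To make the "core transform logic" discussion above concrete, here is a rough, hypothetical sketch of how the base class might wire together a user-provided client, batching, and retries. The names (`RemoteInferenceHandler`, `maxBatchSize`, `maxRetries`) are illustrative assumptions, not taken from the design doc, and this is plain Java rather than a real Beam `PTransform` so it stays self-contained; in Beam, `run` would live inside a `DoFn` and client-side throttling would sit between batching and the retry loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the base class's core call pattern; not Beam code
// and not the design doc's actual API.
public class RemoteInferenceHandler<InputT, OutputT> {
  // User-provided "client and request code": maps a batch of inputs to outputs.
  private final Function<List<InputT>, List<OutputT>> client;
  private final int maxBatchSize;
  private final int maxRetries;

  public RemoteInferenceHandler(
      Function<List<InputT>, List<OutputT>> client, int maxBatchSize, int maxRetries) {
    this.client = client;
    this.maxBatchSize = maxBatchSize;
    this.maxRetries = maxRetries;
  }

  /** Splits the inputs into batches and invokes the client on each, with retries. */
  public List<OutputT> run(List<InputT> inputs) {
    List<OutputT> results = new ArrayList<>();
    for (int start = 0; start < inputs.size(); start += maxBatchSize) {
      List<InputT> batch =
          inputs.subList(start, Math.min(start + maxBatchSize, inputs.size()));
      results.addAll(callWithRetries(batch));
    }
    return results;
  }

  private List<OutputT> callWithRetries(List<InputT> batch) {
    RuntimeException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return client.apply(batch);
      } catch (RuntimeException e) {
        last = e; // remember the failure; back off before retrying
        try {
          Thread.sleep((long) Math.pow(2, attempt) * 100L); // exponential backoff
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new RuntimeException(ie);
        }
      }
    }
    throw last; // all attempts exhausted
  }
}
```

The point of the sketch is the separation of concerns raised in the thread: the transform owns batching and retry policy, while the caller supplies only the request logic, so different model endpoints (Vertex AI, OpenAI, Anthropic, etc.) plug in without changing the base class.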
