[previous didn't cc list, sorry for dupes]

The classic connection pool pattern (expensive connections are created relatively few times and used by lots of transient, short-lived tasks, each of which borrows a connection from the pool and returns it when done) would still be usable here. But as Péter points out, you can't rely on a single global pool instance to protect the backend from overload: each JVM in the job (and possibly each classloader) will have its own pool.
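To make the per-JVM sharing explicit: the usual workaround is to hold the pool behind a static, reference-counted holder, with each task calling acquire() in open() and release() in close(), so every task in the same JVM/classloader reuses one pool instance. A minimal sketch (the SharedResource name and the Supplier/Consumer wiring are my own, not a Flink API):

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Reference-counted per-JVM holder: the first acquire() builds the
// resource (e.g. a connection pool), the last release() closes it.
// Each Flink task would call acquire() in open() and release() in close().
final class SharedResource<T> {
    private final Supplier<T> factory;
    private final Consumer<T> closer;
    private T instance;
    private int refCount;

    SharedResource(Supplier<T> factory, Consumer<T> closer) {
        this.factory = factory;
        this.closer = closer;
    }

    synchronized T acquire() {
        if (refCount++ == 0) {
            instance = factory.get();   // built once per JVM/classloader
        }
        return instance;
    }

    synchronized void release() {
        if (--refCount == 0) {
            closer.accept(instance);    // torn down by the last task to leave
            instance = null;
        }
    }
}
```

In a RichFunction you'd keep a `static final SharedResource<YourPool>` and call acquire()/release() from open()/close(). Note this is still per-JVM, not per-job: it bounds connections per TaskManager, not globally.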
If you really need a global limit, the easiest thing I can think of, if the backend supports it, is to configure a dedicated user/role for the Flink job, configure the backend to allow only a limited number of concurrent connections for that user/role, and (importantly!) make sure the client side reacts well to the no-connections-available condition, e.g. by shunting data into a sink where it can get picked up and reprocessed, or by using an async operator. I'm not sure there's a great way to handle this: you're adding backpressure to the entire job, and/or introducing additional out-of-order processing, which some use cases are very unhappy with.

If you can't rate-limit the backend directly and it's something REST-like, you could try wrapping it in a rate-limiting reverse proxy service, or use a distributed token-bucket kind of architecture and have runtimes try to grab a token before opening a connection. That wouldn't help with the backpressure/out-of-order problems, though.

Stepping back a _little_: if these connections are ultimately being used for KV-like lookups for enrichment purposes, and your use case can tolerate stale values under heavy load, you can add a transient, optimistic "near-line" cache using something like Caffeine to read through to the backend on a miss. That won't do anything about maintaining a global limit, but it can significantly reduce the pressure, depending on the hit rate, which depends on cache sizing, which depends on locality... 😅

HTH!

-0xe1a

On Thu, Mar 21, 2024 at 1:21 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:

> Hi Jacob,
>
> Flink jobs' tasks typically run on multiple nodes/servers. This means
> that it is not possible to have a connection shared at the job level.
>
> You can read about the architecture in more detail in the docs.
> [1]
>
> I hope this helps,
> Péter
>
> [1] -
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/concepts/flink-architecture/
>
> On Thu, Mar 21, 2024, 13:10 Jacob Rollings <jacobrolling...@gmail.com> wrote:
>
>> Hello,
>>
>> Is there a way in Flink to instantiate or open connections (to a
>> cache/db) at a global level, so that they can be reused across many
>> process functions, rather than doing it in each operator's open()?
>> Along with opening, I also wanted to know if there is a way to close
>> them at job stop, such that they are closed at the very end, after
>> each operator's close() method is complete. Basically, the idea is to
>> maintain a single instance at a global level and close its session as
>> a last step, after each operator's close is complete.
>>
>> Regards,
>> JB
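P.S. To make the token-bucket idea above a bit more concrete: in a real deployment the bucket state would live somewhere shared (Redis, a small sidecar service, etc.), but the per-bucket logic is the same as this single-JVM sketch (class and field names are mine, not from any library):

```java
// Single-JVM token bucket: a connection attempt proceeds only if it can
// grab a token; tokens refill at a fixed rate up to a maximum burst size.
// A distributed version would keep (tokens, lastRefill) in shared storage
// and update it atomically, instead of under a local lock.
final class TokenBucket {
    private final double capacity;       // max burst size
    private final double refillPerNano;  // tokens added per nanosecond
    private double tokens;
    private long lastRefill;

    TokenBucket(double capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity;
        this.lastRefill = System.nanoTime();
    }

    synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;   // caller may open a connection
        }
        return false;      // caller should back off / buffer / retry later
    }
}
```

A task would call tryAcquire() before opening a connection and route records to a retry path (or let backpressure build) when it returns false.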
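And to sketch the "near-line" cache shape: Caffeine gives you this (plus TTLs, stats, and async loading) out of the box, so the hand-rolled LRU below is just to show the read-through pattern, with made-up names and no expiry:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Size-bounded read-through cache: get() returns a cached value, or loads
// it from the backend on a miss, evicting the least-recently-used entry
// when full. For real use, Caffeine's LoadingCache does this and more.
final class ReadThroughCache<K, V> {
    private final Function<K, V> loader;  // the expensive backend lookup
    private final Map<K, V> lru;

    ReadThroughCache(int maxSize, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true makes iteration order = least-recently-used first
        this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    synchronized V get(K key) {
        V v = lru.get(key);          // counts as an access (LRU bump)
        if (v == null) {
            v = loader.apply(key);   // backend is hit only on a miss
            lru.put(key, v);         // may evict the eldest entry
        }
        return v;
    }
}
```

Under heavy load with decent key locality, most lookups never reach the backend at all, which is often a bigger win than any connection-count tuning.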