Re: Ensuring a task does not get executed concurrently

2023-06-12 Thread Robert Bradshaw via dev
If you absolutely cannot tolerate concurrency, an external locking mechanism is required. While a distributed system often waits for a work item to fail before trying it, this is not always the case (e.g. backup workers may be scheduled and whoever finishes first is determined to be the successful a
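
As a rough illustration of the external-lock idea, here is a minimal sketch that assumes every worker can see one shared filesystem. The names `exclusive_lock`, `setup_store`, and the `READY` marker file are hypothetical and come from neither Beam nor Xarray-Beam. The lock relies on `os.open` with `O_CREAT | O_EXCL`, which fails if the file already exists, so only one caller at a time holds it.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path, poll_seconds=0.05, timeout_seconds=10.0):
    """Hold a lock file at lock_path for the duration of the with-block."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_seconds)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def setup_store(store_dir, lock_path):
    """Hypothetical idempotent setup; returns True only if this call did the work."""
    with exclusive_lock(lock_path):
        marker = os.path.join(store_dir, "READY")
        if os.path.exists(marker):
            return False  # another caller already finished the setup
        os.makedirs(store_dir, exist_ok=True)
        with open(marker, "w") as f:
            f.write("ok")
        return True
```

A crashed holder leaves the lock file behind in this sketch, and workers without a common filesystem would need a real lock service instead.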

Re: Ensuring a task does not get executed concurrently

2023-06-12 Thread Bruno Volpato via dev
Hi Stephan, I am not sure if this is the best way to achieve this, but I've seen parallelism limited by using state / KV and limiting the number of keys. In your case, you could have the same key for both non-concurrency-safe operations and when using state, the Beam model will guarantee tha
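
To make the "same key" idea concrete, here is a plain-Python model of it, not Beam code: tasks that share a key run one after another, while different keys may run in parallel. The helper `run_serial_per_key` is hypothetical, and the sketch says nothing about what a particular Beam runner guarantees.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_serial_per_key(keyed_tasks, max_workers=4):
    """Run (key, callable) pairs; tasks sharing a key never overlap."""
    groups = defaultdict(list)
    for key, task in keyed_tasks:
        groups[key].append(task)

    def run_group(tasks):
        # One thread handles a whole key, so its tasks run in input order.
        return [task() for task in tasks]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {key: pool.submit(run_group, tasks)
                   for key, tasks in groups.items()}
        return {key: future.result() for key, future in futures.items()}
```

Giving both non-concurrency-safe operations the same key, for example `("setup", create_store)` and `("setup", write_metadata)` with hypothetical callables, places them in one group so they run back to back.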

Ensuring a task does not get executed concurrently

2023-06-12 Thread Stephan Hoyer via dev
Can the Beam data model (specifically the Python SDK) support executing functions that are idempotent but not concurrency-safe? I am thinking of a task like setting up a database (or in my case, a Zarr store in Xarray-Beam) where it is no