Re: Checkpointing on Google Cloud Dataflow Runner

2022-08-30 Thread Reuven Lax via user
Snapshots are expected to happen nearly instantaneously. Although processing is paused while the snapshot is in progress, the pause should usually be very brief. It's true that Dataflow does not support automated snapshots; you would have to create them yourself using a cron job. Checkpoints on Flink ar
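
As an illustration of the cron approach mentioned above, here is a minimal sketch (not from the original message) of a Python script that requests a snapshot by calling the `gcloud dataflow snapshots create` command. The function names, the job ID, the region, and the 7-day TTL are hypothetical placeholders; whether to include `--snapshot-sources` depends on the pipeline's sources, so treat that default as an assumption.

```python
import subprocess


def build_snapshot_command(job_id, region, ttl="7d", snapshot_sources=True):
    """Build the gcloud argv that requests a snapshot of a running streaming job.

    job_id, region and ttl are placeholders supplied by the caller; nothing
    here is specific to the pipeline discussed in this thread.
    """
    return [
        "gcloud", "dataflow", "snapshots", "create",
        f"--job-id={job_id}",
        f"--region={region}",
        f"--snapshot-ttl={ttl}",
        f"--snapshot-sources={'true' if snapshot_sources else 'false'}",
    ]


def create_snapshot(job_id, region):
    """Run the command; raises CalledProcessError if gcloud exits non-zero."""
    subprocess.run(build_snapshot_command(job_id, region), check=True)
```

A scheduler would then call `create_snapshot(...)` at the desired interval, for example from a crontab entry that runs a small wrapper script once an hour. The wrapper and its schedule are left to the reader, since the thread does not specify them.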

Re: Checkpointing on Google Cloud Dataflow Runner

2022-08-30 Thread Will Baker
I looked into snapshots, and they do seem useful as a means to save state and resume; however, they aren't as seamless as I was hoping for, given the automatic checkpointing that other runners support. It looked like snapshots would be user-initiated and would pause the pipeline whil

Re: Checkpointing on Google Cloud Dataflow Runner

2022-08-29 Thread Reuven Lax via user
Google Cloud Dataflow does support snapshots. Is this what you were looking for? On Mon, Aug 29, 2022 at 4:04 PM Kenneth Knowles wrote: > Hi Will, David, > > I think you'll find the best source of answers for this sort of question on

Re: Checkpointing on Google Cloud Dataflow Runner

2022-08-29 Thread Kenneth Knowles
Hi Will, David, I think you'll find the best source of answers for this sort of question on the user@beam list. I've put that in the To: line with a BCC: to the dev@beam list so everyone knows they can find the thread there. If I have misunderstood, and your question has to do with building Beam it