Thanks for sharing the information.

I observed the same: S3 (primary checkpoint storage) + EBS (task-local
recovery) performs better than EBS as the primary checkpoint storage.
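
For reference, a minimal sketch of what such a setup could look like in
flink-conf.yaml, assuming Flink 1.17-era option names and the RocksDB state
backend. The bucket name and mount paths are hypothetical placeholders, not
values from our deployment:

```yaml
# Primary checkpoint storage on S3 (bucket name is a placeholder).
state.backend: rocksdb
state.checkpoints.dir: s3://my-flink-bucket/checkpoints

# Task-local recovery: keep a secondary local copy of state on each TaskManager.
state.backend.local-recovery: true
# Local state copies go to the EBS-backed mount (path is a placeholder).
taskmanager.state.local.root-dirs: /mnt/ebs/flink-local-state

# RocksDB working directory on the same EBS volume (path is a placeholder).
state.backend.rocksdb.localdir: /mnt/ebs/rocksdb
```

Note that the local copy only helps if a restarted task lands on a
TaskManager that still has access to that directory. Otherwise recovery
falls back to reading the checkpoint from S3.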



On Tue, Jul 18, 2023 at 12:21 PM Konstantin Knauf <kna...@apache.org> wrote:

> Hi Prabhu,
>
> this should be possible, but it is quite expensive compared to AWS S3.
> You also have to remount the EBS volumes to the new TaskManagers after a
> failure, which takes a non-trivial amount of time and slows down recovery.
> So, overall, I don't think it's preferable to S3.
>
> We do use EBS volumes for the local RocksDB working directory, though. We
> currently don't remount them on failure because of the additional latency
> that remounting introduces.
>
> Cheers,
>
> Konstantin
>
> On Wed, Jul 12, 2023 at 6:55 PM Prabhu Joseph <
> prabhujose.ga...@gmail.com> wrote:
>
>> Hi,
>>
>> We are investigating the feasibility of using Elastic Block Store (EBS)
>> as checkpoint storage by mounting a volume (a shared local file system
>> path) to the JobManager and all the TaskManager pods. I would appreciate
>> feedback from anyone who has already tried this approach.
>>
>>
>> Thanks,
>> Prabhu Joseph
>>
>
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk
>
