Re: Sometimes checkpoints to s3 fail

2022-10-14 Thread Matthias Pohl via user
Hi Evgeniy, are you using Ceph as an S3 server? All the Google search results for the error message point to Ceph. Could it be that there's a problem with the version of the underlying system? The stacktrace you provided looks like Flink struggles to close the File and, there

Sometimes checkpoints to s3 fail

2022-10-06 Thread Evgeniy Lyutikov
Hello all. I can’t figure out an intermittent problem: sometimes checkpoints stop completing altogether, and sometimes only every other checkpoint completes. I'm running Flink 1.14.4 in Kubernetes application mode. 2022-10-06 09:08:04,731 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Triggering che