I see. Thanks for the info, Till.
I appreciate your help.

Best regards
Rainie

On Thu, Mar 18, 2021 at 2:09 AM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Rainie,
>
> if I remember correctly (unfortunately I don't have an S3 deployment at hand
> to try it out), then in v1.9 you should find the data files for the
> checkpoint under
> s3a://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-2230.
> A checkpoint consists of these data files and a metadata file which links
> the individual data files from the different operators together into a
> checkpoint. The metadata file should be stored under
> s3a://{bucket name}/dev/checkpoints/{job_id}/chk-2230 so that it is easily
> discoverable. If the data files are also contained in
> s3a://{bucket name}/dev/checkpoints/{job_id}/chk-2230, then there is some
> problem or the system did not properly use the entropy functionality.
>
> My suspicion is that with FLINK-5763 (introduced with Flink 1.11) we also
> moved the metadata file under the entropy folder to make the
> checkpoints/savepoints self-contained and relocatable.
>
> Cheers,
> Till
>
> On Wed, Mar 17, 2021 at 10:14 PM Rainie Li <raini...@pinterest.com.invalid>
> wrote:
>
> > Thanks for checking, Till.
> >
> > I have a follow-up question on #2: do you know why the same job does not
> > show the entropy checkpoint path in version 1.9?
> > For example:
> > *When it's running in v1.11, the checkpoint path is:*
> > s3a://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-1537
> > *When it's running in v1.9, the checkpoint path is:*
> > s3a://{bucket name}/dev/checkpoints/{job_id}/chk-2230
> >
> > I'm not sure what caused this inconsistency.
> > Thanks
> > Best regards
> > Rainie
> >
> > On Wed, Mar 17, 2021 at 6:38 AM Till Rohrmann <trohrm...@apache.org>
> > wrote:
> >
> > > Hi Rainie,
> > >
> > > 1. I think what you need to do is to look for the {job_id} in all the
> > > possible sub folders of the dev/checkpoints/ folder, or extract the
> > > entropy from the logs.
> > >
> > > 2. According to [1] entropy should only be used for the data files and
> > > not for the metadata files. The idea was to keep the metadata path
> > > entropy-free in order to make it more easily discoverable. I can imagine
> > > that this changed with FLINK-5763 [2] which was added in Flink 1.11. This
> > > effectively means that in order to make checkpoints/savepoints
> > > self-contained we needed to add the entropy also to the metadata file
> > > paths. Moreover, this also means that the entropy injection works for 1.9
> > > and 1.11. I think it was introduced with Flink 1.6.2, 1.7.0 [3].
> > >
> > > [1]
> > > https://ci.apache.org/projects/flink/flink-docs-stable/deployment/filesystems/s3.html#entropy-injection-for-s3-file-systems
> > > [2] https://issues.apache.org/jira/browse/FLINK-5763
> > > [3] https://issues.apache.org/jira/browse/FLINK-9061
> > >
> > > Cheers,
> > > Till
> > >
> > > On Tue, Mar 16, 2021 at 7:03 PM Rainie Li <raini...@pinterest.com.invalid>
> > > wrote:
> > >
> > > > Hi Flink Developers.
> > > >
> > > > We enabled entropy injection for S3. Here are our settings on the YARN
> > > > cluster:
> > > > s3.entropy.key: _entropy_
> > > > s3.entropy.length: 1
> > > > state.checkpoints.dir: 's3a://{bucket name}/dev/checkpoints/_entropy_'
> > > >
> > > > I have two questions:
> > > > 1. After enabling entropy, the job's checkpoint path changed to:
> > > > *s3://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-607*
> > > > Since we don't know which key is mapped to _entropy_, it cannot be used
> > > > to relaunch Flink jobs by running
> > > > *flink run -s s3://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-607*
> > > > If you have also enabled entropy injection for S3, do you have any
> > > > suggestions on how to recover failed jobs using entropy checkpoints?
> > > >
> > > > 2. We added entropy settings on the YARN cluster, but only Flink jobs on
> > > > version 1.11 show the entropy checkpoint path. Flink jobs on version 1.9
> > > > still use checkpoint paths without entropy, like:
> > > > *s3://{bucket name}/dev/checkpoints/{job_id}/chk-607*
> > > > Is this path equal to
> > > > *s3://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-607*?
> > > > Does entropy work for v1.9? If so, why do v1.9 jobs show checkpoint paths
> > > > *without* entropy?
> > > >
> > > > I would appreciate any suggestions.
> > > > Thanks
> > > > Best regards
> > > > Rainie
> > > >
> > >
> >
>
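
Till's first suggestion in the thread (look for the {job_id} under every possible entropy sub-folder of dev/checkpoints/) can be sketched in a few lines of Python. This is only an illustration under assumptions that the thread does not confirm: the bucket name and job ID are made up, the helper names resolve and candidate_paths are hypothetical, and the entropy alphabet is assumed to be ASCII letters and digits. For s3.entropy.length: 1 it builds the entropy-free path plus one candidate per character; it does not talk to S3.

```python
import string

# Value of s3.entropy.key from the settings quoted in the thread.
ENTROPY_KEY = "_entropy_"
# ASSUMPTION: entropy characters are ASCII letters and digits.
# Adjust this if your Flink version uses a different alphabet.
ALPHABET = string.ascii_letters + string.digits


def resolve(template, entropy, key=ENTROPY_KEY):
    """Replace the entropy key in a path template and collapse empty segments."""
    scheme, sep, rest = template.partition("://")
    if not sep:  # no scheme present, treat the whole string as the path
        scheme, rest = "", template
    rest = rest.replace(key, entropy)
    while "//" in rest:  # an empty entropy value leaves a double slash behind
        rest = rest.replace("//", "/")
    return scheme + sep + rest


def candidate_paths(template, alphabet=ALPHABET):
    """Entropy-free path first, then one path per single-character entropy value."""
    return [resolve(template, "")] + [resolve(template, ch) for ch in alphabet]


if __name__ == "__main__":
    # Hypothetical bucket and job ID, for illustration only.
    template = "s3a://my-bucket/dev/checkpoints/_entropy_/abc123/chk-607"
    for path in candidate_paths(template):
        print(path)
```

Each printed candidate still has to be checked against the bucket with whatever S3 listing tool you normally use; the one that exists is the path to pass to flink run -s. With a longer s3.entropy.length the number of candidates grows quickly, so extracting the entropy from the logs, as Till also suggests, is likely the more practical route.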
