I see, thanks for the info, Till. I appreciate your help.

Best regards
Rainie

On Thu, Mar 18, 2021 at 2:09 AM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Rainie,
>
> if I remember correctly (unfortunately I don't have an S3 deployment at
> hand to try it out), then in v1.9 you should find the data files for the
> checkpoint under
> s3a://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-2230.
> A checkpoint consists of these data files and a metadata file which links
> the individual data files from the different operators together into a
> checkpoint. The metadata file should be stored under
> s3a://{bucket name}/dev/checkpoints/{job_id}/chk-2230 so that it is easily
> discoverable. If the data files are also contained in
> s3a://{bucket name}/dev/checkpoints/{job_id}/chk-2230, then there is some
> problem or the system did not properly use the entropy functionality.
>
> My suspicion is that with FLINK-5763 (this has been introduced with Flink
> 1.11) we moved the metadata file also under the entropy folder to make the
> checkpoints/savepoints self-contained and relocatable.
>
> Cheers,
> Till
>
> On Wed, Mar 17, 2021 at 10:14 PM Rainie Li <raini...@pinterest.com.invalid>
> wrote:
>
> > Thanks for checking, Till.
> >
> > I have a follow-up question for #2: do you know why the same job does
> > not show up under the entropy checkpoint path in version 1.9?
> > For example:
> > When it's running in v1.11, the checkpoint path is:
> > s3a://{bucket name}/dev/checkpoints/_entropy_/{job_id}/chk-1537
> > When it's running in v1.9, the checkpoint path is:
> > s3a://{bucket name}/dev/checkpoints/{job_id}/chk-2230
> >
> > I am not sure what caused this inconsistency.
> > Thanks
> > Best regards
> > Rainie
> >
> > On Wed, Mar 17, 2021 at 6:38 AM Till Rohrmann <trohrm...@apache.org>
> > wrote:
> >
> > > Hi Rainie,
> > >
> > > 1. I think what you need to do is to look for the {job_id} in all the
> > > possible sub folders of the dev/checkpoints/ folder, or you extract
> > > the entropy from the logs.
> > >
> > > 2. According to [1] entropy should only be used for the data files and
> > > not for the metadata files. The idea was to keep the metadata path
> > > entropy free in order to make it more easily discoverable. I can
> > > imagine that this changed with FLINK-5763 [2] which was added in Flink
> > > 1.11. This effectively means that in order to make
> > > checkpoints/savepoints self contained we needed to add the entropy
> > > also to the metadata file paths. Moreover, this also means that the
> > > entropy injection works for 1.9 and 1.11. I think it was introduced
> > > with Flink 1.6.2, 1.7.0 [3].
> > >
> > > [1] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/filesystems/s3.html#entropy-injection-for-s3-file-systems
> > > [2] https://issues.apache.org/jira/browse/FLINK-5763
> > > [3] https://issues.apache.org/jira/browse/FLINK-9061
> > >
> > > Cheers,
> > > Till
> > >
> > > On Tue, Mar 16, 2021 at 7:03 PM Rainie Li
> > > <raini...@pinterest.com.invalid> wrote:
> > >
> > > > Hi Flink Developers.
> > > >
> > > > We enabled entropy injection for S3; here are our settings on the
> > > > Yarn cluster:
> > > > s3.entropy.key: _entropy_
> > > > s3.entropy.length: 1
> > > > state.checkpoints.dir: 's3a://{bucket name}/dev/checkpoints/_entropy_'
> > > >
> > > > I have two questions:
> > > > 1. After enabling entropy, the job's checkpoint path changed to:
> > > > s3://{bucket name}/dev/checkpoints/_entropy_/{job_id}chk-607
> > > > Since we don't know which key is mapped to _entropy_,
> > > > it cannot be used to relaunch Flink jobs by running
> > > > flink run -s s3://{bucket name}/dev/checkpoints/_entropy_/{job_id}chk-607
> > > > If you also enabled entropy injection for S3, any suggestion on how
> > > > to recover failed jobs using entropy checkpoints?
> > > >
> > > > 2. We added entropy settings on the Yarn cluster.
> > > > But we can only see Flink jobs in version 1.11 showing the entropy
> > > > checkpoint path.
> > > > Flink jobs in version 1.9 are still using checkpoint paths without
> > > > entropy, like:
> > > > s3://{bucket name}/dev/checkpoints/{job_id}/chk-607
> > > > Is this path equal to
> > > > s3://{bucket name}/dev/checkpoints/_entropy_/{job_id}chk-607?
> > > > Does entropy work for v1.9? If so, why does the v1.9 job show
> > > > checkpoint paths *without* entropy?
> > > >
> > > > Any suggestions are appreciated.
> > > > Thanks
> > > > Best regards
> > > > Rainie
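
Till's suggestion in the thread (look for the {job_id} in all possible sub folders of dev/checkpoints/) can be sketched roughly as below. This is only an illustration, not Flink code: the function name `candidate_checkpoint_paths`, the bucket name `my-bucket`, and the job id `JOB_ID` are made up, and the assumption that the injected characters come from [0-9a-zA-Z] should be verified against your Flink version. With s3.entropy.length: 1 that gives 62 candidate folders.

```python
import itertools
import string

# Assumption (not confirmed in the thread): entropy characters are drawn
# from digits plus ASCII letters. Pass a different alphabet if needed.
ASSUMED_ALPHABET = string.digits + string.ascii_letters


def candidate_checkpoint_paths(template, job_id, checkpoint,
                               entropy_key="_entropy_", entropy_length=1,
                               alphabet=ASSUMED_ALPHABET):
    """Yield one path per possible entropy string substituted for entropy_key."""
    if entropy_key not in template:
        raise ValueError(f"{entropy_key!r} does not occur in {template!r}")
    for chars in itertools.product(alphabet, repeat=entropy_length):
        # Replace the entropy marker, then append job id and checkpoint folder.
        base = template.replace(entropy_key, "".join(chars)).rstrip("/")
        yield f"{base}/{job_id}/{checkpoint}"


if __name__ == "__main__":
    # Hypothetical values mirroring the settings quoted in the thread.
    for path in candidate_checkpoint_paths(
            "s3a://my-bucket/dev/checkpoints/_entropy_", "JOB_ID", "chk-607"):
        print(path)
```

Each printed path can then be checked for existence with whatever S3 listing tool is at hand (for example `aws s3 ls`), and the one that exists is the path to pass to `flink run -s`. Enumerating is only practical for short entropy lengths; for longer ones, extracting the entropy from the logs, as Till also mentions, is likely the better route.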