Re: Confusing S3 Entropy Injection Behavior

2022-05-19 Thread Aeden Jameson
Thanks for the response, David. I'm using Flink 1.13.5.
>> For point 1 the behavior you are seeing is what is expected.
Great. That's what I concluded after digging into things a little more. This helps me be sure I didn't just miss some other configuration. Thank you.
>> For point 2, I'm not sur

Re: Confusing S3 Entropy Injection Behavior

2022-05-19 Thread David Anderson
Aeden, I want to expand my answer after having re-read your question a bit more carefully. For point 1, the behavior you are seeing is what is expected. With Hadoop, the metadata written by the job manager will literally include "_entropy_" in its path, while this will be replaced in paths of any a

Re: Confusing S3 Entropy Injection Behavior

2022-05-19 Thread David Anderson
This sounds like it could be FLINK-17359 [1]. What version of Flink are you using? Another likely explanation arises from the fact that only the checkpoint data files (the ones created and written by the task managers) will have the _entropy_ replaced. The job manager does not inject entropy into

Confusing S3 Entropy Injection Behavior

2022-05-18 Thread Aeden Jameson
I have checkpoints set up against S3 using the hadoop plugin. (I'll migrate to presto at some point.) I've set up entropy injection per the documentation with
state.checkpoints.dir: s3://my-bucket/_entropy_/my-job/checkpoints
s3.entropy.key: _entropy_
I'm seeing some behavior that I don't quite unde
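
For readers following the thread, here is a rough Python sketch of the path behavior the replies describe, using the two settings from this post. It is only an illustration, not Flink's implementation: the helper `inject_entropy`, the file names under `chk-1`, and the entropy length of 4 are assumptions made for the example.

```python
import random
import string

# Values taken from the original post.
CHECKPOINT_DIR = "s3://my-bucket/_entropy_/my-job/checkpoints"  # state.checkpoints.dir
ENTROPY_KEY = "_entropy_"                                       # s3.entropy.key


def inject_entropy(path, key=ENTROPY_KEY, length=4, rng=random):
    """Hypothetical helper: swap the entropy key for random characters.

    The length of 4 is an assumption for this sketch.
    """
    if key not in path:
        return path
    alphabet = string.ascii_lowercase + string.digits
    token = "".join(rng.choice(alphabet) for _ in range(length))
    return path.replace(key, token, 1)


# Checkpoint data files (written by the task managers) get the key replaced.
# "chk-1/hypothetical-data-file" is a made-up suffix for illustration.
data_file_path = inject_entropy(CHECKPOINT_DIR + "/chk-1/hypothetical-data-file")

# Per the replies, with the hadoop plugin the job manager's metadata path
# keeps the literal key, so no substitution is applied here.
metadata_path = CHECKPOINT_DIR + "/chk-1/_metadata"

print(data_file_path)
print(metadata_path)
```

Under these assumptions the data file path carries a random four-character segment where `_entropy_` was, while the metadata path still contains `_entropy_` verbatim, which matches what the replies call expected behavior.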