Re: Flink autoscaler with AWS ASG: checkpoint access issue

2024-05-20 Thread Chetas Joshi
Hello,

After digging into the 403 issue a bit, I figured out that after the scale-up event, flink-s3-fs-presto uses the node instance profile instead of IRSA (IAM Roles for Service Accounts) on some of the newly created TM pods.

1. Has anyone else experienced this as well?
2. Verified that this is an issue
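A minimal diagnostic sketch that can be run inside a newly created TM pod to see whether the IRSA web-identity role or the node instance profile is being resolved. It assumes the AWS SDK v2 is on the classpath; the class name and region are placeholders, not anything from the original setup.

import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.sts.StsClient;
import software.amazon.awssdk.services.sts.model.GetCallerIdentityRequest;

// Hypothetical diagnostic: print the IRSA env vars injected by the EKS pod
// identity webhook and ask STS which principal the default credential chain
// actually resolves to on this pod.
public class WhoAmI {
    public static void main(String[] args) {
        // If these env vars are missing, the SDK (and flink-s3-fs-presto)
        // falls back to the node instance profile.
        System.out.println("AWS_ROLE_ARN=" + System.getenv("AWS_ROLE_ARN"));
        System.out.println("AWS_WEB_IDENTITY_TOKEN_FILE="
                + System.getenv("AWS_WEB_IDENTITY_TOKEN_FILE"));

        try (StsClient sts = StsClient.builder()
                .region(Region.US_WEST_2) // placeholder: use the cluster's region
                .credentialsProvider(DefaultCredentialsProvider.create())
                .build()) {
            // The ARN shows whether the IRSA role or the node role is in use.
            System.out.println("Caller identity: " + sts
                    .getCallerIdentity(GetCallerIdentityRequest.builder().build())
                    .arn());
        }
    }
}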

Flink autoscaler with AWS ASG: checkpoint access issue

2024-05-13 Thread Chetas Joshi
Hello,

Setup: I am running my Flink streaming jobs (upgradeMode = stateless) on an AWS EKS cluster. The node type for the pods of the streaming jobs belongs to a node group that has an AWS ASG (auto scaling group). The streaming jobs are FlinkDeployments managed by the flink-k8s-operator (1.8).
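One way to confirm the checkpoint access problem from inside an affected TM pod is to list the checkpoint prefix with the pod's ambient credentials; a misapplied node instance profile typically surfaces here as a 403 (AccessDenied). This is only a sketch assuming the AWS SDK v2 is available; the bucket, prefix, and region below are placeholders.

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.S3Exception;

// Hypothetical check: attempt a single-key listing of the checkpoint prefix
// using whatever credentials the pod resolves by default.
public class CheckpointAccessCheck {
    public static void main(String[] args) {
        String bucket = "my-checkpoint-bucket"; // placeholder
        String prefix = "flink/checkpoints/";   // placeholder
        try (S3Client s3 = S3Client.builder()
                .region(Region.US_WEST_2)       // placeholder: use the bucket's region
                .build()) {
            s3.listObjectsV2(ListObjectsV2Request.builder()
                    .bucket(bucket)
                    .prefix(prefix)
                    .maxKeys(1)
                    .build());
            System.out.println("Checkpoint prefix is readable with this pod's credentials.");
        } catch (S3Exception e) {
            // A 403 here with the node role in the error context points at the
            // IRSA credentials not being picked up on this pod.
            System.out.println("S3 returned " + e.statusCode() + ": " + e.getMessage());
        }
    }
}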