Would you guys (Flink devs) be interested in our solution for ZooKeeper-less HA? I could ask the managers how they feel about open-sourcing the improvement.
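For anyone skimming the thread: the ZooKeeper-backed HA mode discussed below comes down to a few flink-conf.yaml entries. A minimal sketch; the quorum addresses, storage bucket, and cluster id are placeholders:

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk-0.zk:2181,zk-1.zk:2181,zk-2.zk:2181
    high-availability.storageDir: gs://my-flink-bucket/ha
    high-availability.cluster-id: /my-job-cluster

With these set, the JobManager persists job graphs and checkpoint metadata under the storage dir, so after an unplanned pod restart it can recover from the latest checkpoint rather than replaying a configured savepoint.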
> On Sep 25, 2019, at 11:49 AM, Yun Tang <myas...@live.com> wrote:
>
> As Aleksandar said, k8s with an HA configuration could solve your problem. There has already been some discussion about how to implement such HA in k8s without a ZooKeeper service: FLINK-11105 [1] and FLINK-12884 [2]. Currently, you can only choose ZooKeeper as the high-availability service.
>
> [1] https://issues.apache.org/jira/browse/FLINK-11105
> [2] https://issues.apache.org/jira/browse/FLINK-12884
>
> Best
> Yun Tang
>
> From: Aleksandar Mastilovic <amastilo...@sightmachine.com>
> Sent: Thursday, September 26, 2019 1:57
> To: Sean Hester <sean.hes...@bettercloud.com>
> Cc: Hao Sun <ha...@zendesk.com>; Yuval Itzchakov <yuva...@gmail.com>; user <user@flink.apache.org>
> Subject: Re: Challenges Deploying Flink With Savepoints On Kubernetes
>
> Can’t you simply use the JobManager in HA mode? It would pick up where it left off if you don’t provide a savepoint.
>
>> On Sep 25, 2019, at 6:07 AM, Sean Hester <sean.hes...@bettercloud.com> wrote:
>>
>> thanks for all the replies! i'll definitely take a look at the Flink k8s Operator project.
>>
>> i'll try to restate the issue to clarify. this issue is specific to starting a job from a savepoint in job-cluster mode. in these cases the Job Manager container is configured to run a single Flink job at start-up, and the savepoint needs to be provided as an argument to the entrypoint. the Flink documentation for this approach is here:
>>
>> https://github.com/apache/flink/tree/master/flink-container/kubernetes#resuming-from-a-savepoint
>>
>> the issue is that taking this approach means the job will always start from the savepoint provided as the start argument in the Kubernetes YAML. this includes unplanned restarts of the job manager, but we'd really prefer that any unplanned restarts resume from the most recent checkpoint instead of restarting from the configured savepoint. so in a sense we want the savepoint argument to be transient, used only during the initial deployment, but this runs counter to the design of Kubernetes, which always wants to restore a deployment to the "goal state" defined in the YAML.
>>
>> i hope this helps. if you want more details please let me know, and thanks again for your time.
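To make the mechanics concrete: the job-cluster pattern from that doc bakes the savepoint path into the container args of the pod spec. A minimal sketch; the image name, job class, and savepoint path are illustrative placeholders:

    containers:
      - name: flink-job-cluster
        image: my-registry/my-flink-job:1.9.0
        args:
          - "job-cluster"
          - "--job-classname"
          - "com.example.MyStreamingJob"
          - "--fromSavepoint"
          - "gs://my-bucket/savepoints/savepoint-abc123"

Because this spec is the deployment's goal state, every restart of the pod re-runs exactly these args, savepoint included, which is the transience problem Sean describes.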
>> On Tue, Sep 24, 2019 at 1:09 PM Hao Sun <ha...@zendesk.com> wrote:
>> I think I overlooked it. Good point. I am using Redis to save the path to my savepoint, so I might be able to set a TTL to avoid such an issue.
>>
>> Hao Sun
>>
>> On Tue, Sep 24, 2019 at 9:54 AM Yuval Itzchakov <yuva...@gmail.com> wrote:
>> Hi Hao,
>>
>> I think he's talking about exactly the use case where the JM/TM restart and come back up from the latest savepoint, which might be stale by that time.
>>
>> On Tue, 24 Sep 2019, 19:24 Hao Sun, <ha...@zendesk.com> wrote:
>> We always take a savepoint before we shut down the job-cluster, so the savepoint is always the latest. When we fix a bug or change the job graph, it can resume well. We only use checkpoints for unplanned downtime, e.g. K8s killing the JM/TM, an uncaught exception, etc.
>>
>> Maybe I do not understand your use case well; I do not see a need to start from a checkpoint after a bug fix. From what I know, you can currently use a checkpoint as a savepoint as well.
>>
>> Hao Sun
>>
>> On Tue, Sep 24, 2019 at 7:48 AM Yuval Itzchakov <yuva...@gmail.com> wrote:
>> AFAIK there's currently nothing implemented to solve this problem, but a possible fix could be built on top of https://github.com/lyft/flinkk8soperator, which already has a pretty fancy state machine for rolling upgrades. I'd love to be involved, as this is an issue I've been thinking about as well.
>>
>> Yuval
>>
>> On Tue, Sep 24, 2019 at 5:02 PM Sean Hester <sean.hes...@bettercloud.com> wrote:
>> hi all--we've run into a gap (knowledge? design? tbd?) for our use cases when deploying Flink jobs to start from savepoints using the job-cluster mode in Kubernetes.
>>
>> we're running ~15 different jobs, all in job-cluster mode, using a mix of Flink 1.8.1 and 1.9.0, under GKE (Google Kubernetes Engine). these are all long-running streaming jobs, all essentially acting as microservices. we're using Helm charts to configure all of our deployments.
>>
>> we have a number of use cases where we want to restart jobs from a savepoint to replay recent events, i.e. when we've enhanced the job logic or fixed a bug. but after the deployment we want the job to resume its "long-running" behavior, where any unplanned restarts resume from the latest checkpoint.
>>
>> the issue we run into is that any obvious/standard/idiomatic Kubernetes deployment includes the savepoint argument in the configuration. if the Job Manager container(s) have an unplanned restart, when they come back up they will start from the savepoint instead of resuming from the latest checkpoint. everything is working as configured, but that's not exactly what we want. we want the savepoint argument to be transient somehow (only used during the initial deployment), but Kubernetes doesn't really support the concept of transient configuration.
>>
>> i can see a couple of potential solutions that either involve custom code in the jobs or custom logic in the container (i.e. a custom entrypoint script that records that the configured savepoint has already been used in a file on a persistent volume or GCS, and potentially when/why/by which deployment). but these seem like unexpected and hacky solutions. before we head down that road i wanted to ask:
>>
>> is this already a solved problem that i've missed?
>> is this issue already on the community's radar?
>>
>> thanks in advance!
>>
>> --
>> Sean Hester | Senior Staff Software Engineer | m. 404-828-0865
>>
>> --
>> Best Regards,
>> Yuval Itzchakov.
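As a footnote on the marker-file idea Sean floats above, a wrapper entrypoint along these lines could work. This is only a sketch of the described workaround, not an established solution; the marker path, env var names, and entrypoint location are assumptions:

    #!/usr/bin/env sh
    # Pass --fromSavepoint only on the first start of this deployment.
    # The marker file lives on a persistent volume (an object in GCS
    # would work the same way).
    MARKER=/data/savepoint-used
    if [ -n "$SAVEPOINT_PATH" ] && [ ! -f "$MARKER" ]; then
      # Initial deployment: record that the savepoint was consumed, then use it.
      echo "$SAVEPOINT_PATH used at $(date -u) on $HOSTNAME" > "$MARKER"
      exec /docker-entrypoint.sh job-cluster \
        --job-classname "$JOB_CLASSNAME" \
        --fromSavepoint "$SAVEPOINT_PATH"
    else
      # Unplanned restart: omit the savepoint and rely on HA state and
      # the latest checkpoint.
      exec /docker-entrypoint.sh job-cluster --job-classname "$JOB_CLASSNAME"
    fi

One caveat: omitting --fromSavepoint only resumes from the latest checkpoint if HA is enabled; without HA the job would restart from empty state.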