Hi Alexey, Thanks for the feedback. You are right. StatefulSet + PersistentVolume + FileSystemHaService could be another bundle of services for Flink HA support on K8s. The user jars could be built into the image or downloaded by init-container or mount via the PV. So they do not need to be recovered from HA storage. But i think the checkpoint path and counter should be persisted so that we could recover from the latest checkpoint.
Best, Yang Alexey Trenikhun <yen...@msn.com> 于2020年9月2日周三 上午7:36写道: > Judging from FLIP-19 (thank you Roman for the link), of 3 use cases (jars, > RPC messages and log files) only jar files need HA guarantee, and in my > particular case, job cluster with jar as part of image, it seems doesn't > matter, I guess it explains why in my test I was able to recover from > failure even VoidBlobStore. I also use StatefulSet instead of Deployment > > Thanks, > Alexey > > ------------------------------ > *From:* Yang Wang <danrtsey...@gmail.com> > *Sent:* Tuesday, September 1, 2020 1:58 AM > *To:* dev <d...@flink.apache.org> > *Cc:* Alexey Trenikhun <yen...@msn.com>; Flink User Mail List < > user@flink.apache.org> > *Subject:* Re: FileSystemHaServices and BlobStore > > Hi Alexey, > > Glad to hear that your are interested the K8s HA support. > > Roman's answer is just on point. > > "FileSystemBlobStore" is trying to store the user jars, job graph, etc. on > the distributed storage(e.g. HDFS, S3, GFS). So when the > JobManager failover, it could fetch the blob data from remote storage. It > is very important for standalone and Yarn deployment since > the local blob store is ephemeral and will be cleaned up after JobManager > terminated. > > However, in your case, benefit from the K8s persistent volume, all the > local blob data could be recovered after JobManager pod restarted. > Then you could find that the jobs are recovered and keeps to running. > Please also remember that the checkpoint meta and counter also > need to be stored in local file. After then the Flink job could recover > from the latest checkpoint successfully. > > > About the FileSystemHaService > I am a little skeptical about this feature. Currently, we are using K8s > deployment for the JobManager. And it is not always guaranteed only > one JobManager is running. For example, the kubelet is down and never be > pulled up again. I am trying to work on another ticket "native K8s HA"[1], > in which we will get a fully functional HA service, including leader > election/retrieval, jobgraph meta store, checkpoint meta store, running > registry, etc. > It could be used for standalone K8s and native K8s deployment. > > > [1]. https://issues.apache.org/jira/browse/FLINK-12884 > > Best, > Yang > > > Khachatryan Roman <khachatryan.ro...@gmail.com> 于2020年8月31日周一 下午8:52写道: > > + dev > > Blob store is used for jars, serialized job, and task information and logs. > You can find some information at > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture > > > I guess in your setup, Flink was able to pick up local files. > HA setup presumes that Flink can survive the loss of that JM host and its > local files. > > I'm not sure about K8s native setup - probably VoidBlobStore is enough if > there is a persistent volume. > But in the general case, FileSystemBlobStore should be used to store files > on some DFS. > > > Regards, > Roman > > > On Sat, Aug 29, 2020 at 6:42 PM Alexey Trenikhun <yen...@msn.com> wrote: > > > Did test with streaming job and FileSystemHaService using VoidBlobStore > > (no HA Blob), looks like job was able to recover from both JM restart and > > TM restart. Any idea in what use cases HA Blob is needed? > > > > Thanks, > > Alexey > > ------------------------------ > > *From:* Alexey Trenikhun <yen...@msn.com> > > *Sent:* Friday, August 28, 2020 11:31 AM > > *To:* Khachatryan Roman <khachatryan.ro...@gmail.com> > > *Cc:* Flink User Mail List <user@flink.apache.org> > > *Subject:* Re: FileSystemHaServices and BlobStore > > > > Motivation is to have k8s HA setup without extra component - Zookeeper, > > see [1] > > > > Purpose of BlobStore is vague to me, what kind of BLOBs are stored? Looks > > like if we start job from savepoint, then persistence of BlobStore is > > not necessary, but is it needed if we recover from checkpoint? > > > > Thanks, > > Alexey > > > > [1]. https://issues.apache.org/jira/browse/FLINK-17598 > > > > > > ------------------------------ > > *From:* Khachatryan Roman <khachatryan.ro...@gmail.com> > > *Sent:* Friday, August 28, 2020 9:24 AM > > *To:* Alexey Trenikhun <yen...@msn.com> > > *Cc:* Flink User Mail List <user@flink.apache.org> > > *Subject:* Re: FileSystemHaServices and BlobStore > > > > Hello Alexey, > > > > I think you need FileSystemBlobStore as you are implementing HA Services, > > and BLOBs should be highly available too. > > However, I'm a bit concerned about the direction in general: it > > essentially means re-implementing ZK functionality on top of FS. > > What are the motivation and the use case? > > > > Regards, > > Roman > > > > > > On Fri, Aug 28, 2020 at 5:15 PM Alexey Trenikhun <yen...@msn.com> wrote: > > > > Hello, > > I'm thinking about implementing FileSystemHaServices - single leader, but > > persistent RunningJobRegistry, CheckpointIDCounter, > > CompletedCheckpointStore and JobGraphStore. I'm not sure do you need > > FileSystemBlobStore or VoidBlobStore is enough. Can't figure out, should > > BlobStore survive JobManager crash. I see that ZookeeperHaServices use > FileSystemBlobStore, > > but not clear is to due to having multiple JobManagers (leader + > follower) > > or necessity to preserve BLOBs on restart. > > > > Thanks, > > Alexey > > > > > >