[ https://issues.apache.org/jira/browse/FLINK-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16601838#comment-16601838 ]
陈梓立 commented on FLINK-10256:
-----------------------------

Hi guys,

I ran into a problem when porting the JM failover cases to the FLIP-6 codebase.

On the legacy codebase, we build a MiniCluster directly on the actor system, while on the FLIP-6 codebase, Flink uses a wrapped RpcService to communicate. In that setup, calling stopService acts as a postStop for the RpcEndpoint. But for JM failover, we would prefer a "force shutdown", like a killed process or an actor that was sent a PoisonPill.

I think the JobMaster failover cases are important, since they are common failures, and it would be good to port them to the FLIP-6 codebase. So I wonder: how could I simulate such a case with the FLIP-6 MiniCluster?

cc [~till.rohrmann]

> Port legacy jobmanager test to FLIP-6
> -------------------------------------
>
>                 Key: FLINK-10256
>                 URL: https://issues.apache.org/jira/browse/FLINK-10256
>             Project: Flink
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 1.7.0
>            Reporter: 陈梓立
>            Assignee: 陈梓立
>            Priority: Major
>             Fix For: 1.7.0
>
>
> I am planning to rework JobManagerFailsITCase and JobManagerTest into
> JobMasterITCase and JobMasterHAITCase. That is, reorganize the legacy tests,
> tidy them up, and cover each case explicitly. The PR should follow before
> this weekend.
> While reworking, I'd like to add the JM failover test cases listed below, in
> preparation for the future implementation of JM failover with the
> RECONCILING state. By "JM failover", I mean a real-world failover (such as a
> power loss or a process exit), without calling Flink's internal postStop
> logic or anything like it.
> 1. Streaming task with JM failover.
> 2. Streaming task with JM failover concurrent with a task failure.
> 3. Batch task with JM failover.
> 4. Batch task with JM failover concurrent with a task failure.
> 5. Batch task with JM failover when some vertex has already FINISHED.
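
To make the distinction above concrete, here is a minimal Java sketch of the two shutdown paths a failover test would need to tell apart. Everything in it is hypothetical: `FailoverSketch`, `FakeEndpoint`, `stopGracefully`, and `kill` are illustration-only names, not Flink or Akka APIs, and the sketch does not answer how to trigger a hard kill on the FLIP-6 MiniCluster. It only shows the property such a test would assert: a graceful stop runs the cleanup hook, while a simulated crash does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: hypothetical names, no Flink or Akka classes involved. */
public class FailoverSketch {

    /** Stand-in for an endpoint with a cleanup hook, loosely modeled on the postStop idea. */
    public static class FakeEndpoint {
        /** Records which lifecycle hooks actually ran. */
        public final List<String> events = new ArrayList<>();
        private boolean running = true;

        /** Graceful path: the cleanup hook runs before the endpoint stops. */
        public void stopGracefully() {
            if (running) {
                events.add("postStop");
                running = false;
            }
        }

        /** Abrupt path: the endpoint simply disappears and no cleanup hook runs. */
        public void kill() {
            running = false;
        }

        public boolean isRunning() {
            return running;
        }
    }

    public static void main(String[] args) {
        FakeEndpoint graceful = new FakeEndpoint();
        graceful.stopGracefully();

        FakeEndpoint killed = new FakeEndpoint();
        killed.kill();

        // A failover test wants the second situation: stopped, but with no cleanup performed.
        System.out.println("graceful events: " + graceful.events); // prints "graceful events: [postStop]"
        System.out.println("killed events: " + killed.events);     // prints "killed events: []"
    }
}
```

Under this framing, the five cases in the issue description would all use the `kill`-style path, so that whatever state the cleanup hook normally releases is still present when a new JobMaster takes over.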