[jira] [Commented] (FLINK-18733) Jobmanager cannot start in HA mode with Zookeeper

Leonid Ilyevsky (Jira) Wed, 29 Jul 2020 06:46:25 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167223#comment-17167223
 ]


Leonid Ilyevsky commented on FLINK-18733:
-----------------------------------------

Thanks [~trohrmann].

It seems that https://issues.apache.org/jira/browse/ZOOKEEPER-3590 was only 
about the ability to set the property.

I guess the actual problem was fixed before that. We don't need to know when 
the bug was introduced or when it was fixed, as long as it works now.

But please notify your team that there is a potential problem when using the 
default version.

> Jobmanager cannot start in HA mode with Zookeeper
> -------------------------------------------------
>
>                 Key: FLINK-18733
>                 URL: https://issues.apache.org/jira/browse/FLINK-18733
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.11.1
>            Reporter: Leonid Ilyevsky
>            Priority: Major
>         Attachments: flink-conf.yaml, 
> flink-liquidnt-standalonesession-0-nj1dvloglab01.liquidnet.biz.log, 
> flink-liquidnt-taskexecutor-0-nj1dvloglab01.liquidnet.biz.log
>
>
> When configured in HA mode, the Jobmanager cannot start at all. First, it 
> issues warnings like this:
> {quote}{{2020-07-27 08:58:23,197 WARN 
> org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn [] - 
> Session 0x0 for server *nj1dvloglab01.liquidnet.biz/<unresolved>:2181*, 
> unexpected error, closing socket connection and attempting reconnect}}
>  {{java.lang.IllegalArgumentException: *Unable to canonicalize address* 
> nj1dvloglab01.liquidnet.biz/<unresolved>:2181 because it's not resolvable}}
>  {{ at 
> org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:65)
>  ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]}}
>  {{ at 
> org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41)
>  ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]}}
>  {{ at 
> org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1001)
>  ~[flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]}}
>  {{ at 
> org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)
>  [flink-shaded-zookeeper-3.4.14.jar:3.4.14-11.0]}}
> {quote}
> After a few attempts to connect to ZooKeeper, it finally fails:
> {quote}2020-07-27 08:59:35,055 ERROR 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error 
> occurred in the cluster entrypoint.
>  org.apache.flink.util.FlinkException: Unhandled error in 
> ZooKeeperLeaderElectionService: Ensure path threw exception
>  at 
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService.unhandledError(ZooKeeperLeaderElectionService.java:430)
>  ~[flink-dist_2.12-1.11.1.jar:1.11.1]
> {quote}
>  
> The same HA configuration works fine for me in Flink 1.10.0.
>  
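
For readers unfamiliar with the `<unresolved>` marker in the warning above, here is a minimal sketch of what it may indicate, assuming the marker comes from a `java.net.InetSocketAddress` that was created without a DNS lookup. The class name `UnresolvedAddressCheck`, the helper `lacksResolvedIp`, and the host `zk.example.invalid` are hypothetical; this does not reproduce the ZooKeeper client's actual code path, it only shows that such an address carries a host name but no `InetAddress`.

```java
import java.net.InetSocketAddress;

public class UnresolvedAddressCheck {

    // Returns true when the socket address holds only a host name and no resolved IP.
    static boolean lacksResolvedIp(InetSocketAddress addr) {
        return addr.isUnresolved() && addr.getAddress() == null;
    }

    public static void main(String[] args) {
        // createUnresolved never performs a DNS lookup, so this runs offline.
        // The host name is a placeholder, not the one from the report.
        InetSocketAddress unresolved =
                InetSocketAddress.createUnresolved("zk.example.invalid", 2181);
        System.out.println(unresolved + " lacksResolvedIp=" + lacksResolvedIp(unresolved));

        // A literal IP passed to the regular constructor is resolved without DNS.
        InetSocketAddress loopback = new InetSocketAddress("127.0.0.1", 2181);
        System.out.println(loopback + " lacksResolvedIp=" + lacksResolvedIp(loopback));
    }
}
```

Any code that calls `getAddress()` on the first kind of address gets `null` back and has to handle that case itself. Whether that is what happens in the shaded 3.4.14 client is an assumption here, not something the logs confirm.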



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
