ChenSammi commented on PR #7262:
URL: https://github.com/apache/ozone/pull/7262#issuecomment-2771426868

   Thanks @symious for working on this. I have gone through the code at a high 
level and tested it locally. Here are my findings. 
   a. Transferring leadership to a listener fails as expected, but the error 
message can be improved. If the requested new leader is a listener, we can fail 
the request right away with a message saying so. 
   ```
   bash-5.1$ ozone admin om transfer -n=om4
   INTERNAL_ERROR om2@group-D66704EFC61C refused to transfer leadership to peer 
om4 as it is not in conf: {index: 13, cur=peers:[om1|om1:9872, om3|om3:9872, 
om2|om2:9872]|listeners:[om4|om4:9872], old=null}
   ```
   b. There should be a Robot test for listener OM.
   c. We need a document explaining how to configure a listener OM and how to 
bootstrap a new listener OM.  
   d. While a listener OM is running, OM requests are sent to the listener OM 
too. If we can skip the listener, that will reduce RPC retries, and RPC latency 
will not be impacted by introducing a listener OM. 
   ```
   2025-03-31 10:30:04,540 [main] DEBUG ipc.Client: Connecting to 
om1/<unresolved>:9862
   2025-03-31 10:30:04,540 [main] DEBUG ipc.Client: Setup connection to 
om1/<unresolved>:9862
   2025-03-31 10:30:04,543 [main] DEBUG ipc.Client: Failed to connect to 
server: om1/<unresolved>:9862: failovers (0) exceeded maximum allowed (0)
   java.net.UnknownHostException: Invalid host name: local host is: 
"om4/172.22.0.11"; destination host is: "om1":9862; 
java.net.UnknownHostException; For more details see:  
http://wiki.apache.org/hadoop/UnknownHost
        at 
java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
        at 
java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
        at 
java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:961)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:889)
        at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:619)
        at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:789)
        at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:364)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1649)
        at org.apache.hadoop.ipc.Client.call(Client.java:1473)
        at org.apache.hadoop.ipc.Client.call(Client.java:1426)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:250)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:132)
        at jdk.proxy2/jdk.proxy2.$Proxy20.submitRequest(Unknown Source)
        at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:437)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:170)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:162)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:100)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:366)
        at jdk.proxy2/jdk.proxy2.$Proxy20.submitRequest(Unknown Source)
        at 
org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransport.submitRequest(Hadoop3OmTransport.java:73)
        at 
org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.submitRequest(OzoneManagerProtocolClientSideTranslatorPB.java:340)
        at 
org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.getServiceInfo(OzoneManagerProtocolClientSideTranslatorPB.java:1880)
        at 
org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:260)
        at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:264)
        at 
org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:131)
        at 
org.apache.hadoop.ozone.shell.OzoneAddress.createRpcClientFromServiceId(OzoneAddress.java:120)
        at 
org.apache.hadoop.ozone.shell.OzoneAddress.createClient(OzoneAddress.java:167)
        at org.apache.hadoop.ozone.shell.Handler.createClient(Handler.java:82)
        at org.apache.hadoop.ozone.shell.Handler.call(Handler.java:70)
        at org.apache.hadoop.ozone.shell.Handler.call(Handler.java:36)
        at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
        at picocli.CommandLine.access$1500(CommandLine.java:148)
        at 
picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
        at 
picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
        at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
        at picocli.CommandLine.execute(CommandLine.java:2170)
        at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:88)
        at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:79)
        at org.apache.hadoop.ozone.shell.Shell.lambda$run$0(Shell.java:100)
        at 
org.apache.hadoop.hdds.tracing.TracingUtil.executeInSpan(TracingUtil.java:182)
        at 
org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:147)
        at org.apache.hadoop.ozone.shell.Shell.run(Shell.java:100)
        at org.apache.hadoop.ozone.shell.OzoneShell.main(OzoneShell.java:49)
   Caused by: java.net.UnknownHostException
        at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:621)
        ... 42 more
   ``` 
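
   To make (a) and (d) concrete, here is a rough sketch of the idea, not the 
actual Ozone or Ratis code. The class `ListenerAwareOmNodes` and its methods 
are hypothetical names used only for illustration; the node IDs are taken from 
the output above. It assumes the set of listener node IDs is known on the side 
doing the check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Ozone or Ratis. It only illustrates
// treating listener OMs differently from voting peers.
final class ListenerAwareOmNodes {
  private final Set<String> peers;      // voting members, e.g. om1..om3
  private final Set<String> listeners;  // non-voting members, e.g. om4

  ListenerAwareOmNodes(Set<String> peers, Set<String> listeners) {
    this.peers = Set.copyOf(peers);
    this.listeners = Set.copyOf(listeners);
  }

  // (a) Reject a listener target up front with a direct message instead of
  // surfacing the lower-level "not in conf" error.
  void checkTransferTarget(String nodeId) {
    if (listeners.contains(nodeId)) {
      throw new IllegalArgumentException("Cannot transfer leadership to "
          + nodeId + ": it is a listener and cannot become leader.");
    }
    if (!peers.contains(nodeId)) {
      throw new IllegalArgumentException("Unknown OM node: " + nodeId);
    }
  }

  // (d) Keep only non-listener nodes in the client failover order, so that
  // requests and retries are not spent on a listener.
  List<String> failoverOrder(List<String> configuredNodes) {
    List<String> order = new ArrayList<>();
    for (String nodeId : configuredNodes) {
      if (!listeners.contains(nodeId)) {
        order.add(nodeId);
      }
    }
    return order;
  }

  public static void main(String[] args) {
    ListenerAwareOmNodes nodes = new ListenerAwareOmNodes(
        Set.of("om1", "om2", "om3"), Set.of("om4"));

    // The listener om4 is dropped from the failover order.
    System.out.println(nodes.failoverOrder(
        List.of("om1", "om2", "om3", "om4")));

    // Transferring to the listener fails early with a clear message.
    try {
      nodes.checkTransferTarget("om4");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

   Where the check lives (CLI, OM server, or client failover proxy provider) 
is left open here; the point is only that the listener set is consulted before 
a transfer is attempted or a request is routed.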
   
   I would suggest completing (a) and (b) in this patch. For (c) and (d), I'm 
fine with either implementing them in this patch or filing follow-up JIRAs. 
If follow-up JIRAs are filed, then I would suggest turning HDDS-11523 
into an umbrella JIRA with sub-task JIRAs for this feature. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

