Github user bolkedebruin commented on the pull request:
https://github.com/apache/spark/pull/7489#issuecomment-122700827
Ok. I have tested the same on a CDH 5.4.2 cluster and I see
differences between 1) Spark 1.3.0 (bundled with CDH 5.4.2), 2) Spark
1.5.0-SNAPSHOT, and 3) HDP vs. CDH.
1) Spark 1.3.0 does not connect to the resource manager but to the
scheduler (which runs on port 8030) instead:
```
15/07/19 21:42:00 INFO YarnRMClient: Registering the ApplicationMaster
15/07/19 21:42:00 DEBUG Client: The ping interval is 60000 ms.
15/07/19 21:42:00 DEBUG Client: Connecting to
master01.paymentslab.int/172.17.12.10:8030
15/07/19 21:42:00 DEBUG UserGroupInformation: PrivilegedAction
as:bolkedebruin (auth:SIMPLE)
from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/19 21:42:00 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
15/07/19 21:42:00 DEBUG SaslRpcClient: Received SASL message state:
NEGOTIATE
auths {
method: "TOKEN"
mechanism: "DIGEST-MD5"
protocol: ""
serverId: "default"
challenge:
"realm=\"default\",nonce=\"P6hVNMbIZZ+KtpdxwsktwDpkortSjhlXdI1heRHb\",qop=\"auth\",charset=utf-8,algorithm=md5-sess"
}
15/07/19 21:42:00 DEBUG SaslRpcClient: Get token info proto:interface
org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB
info:org.apache.hadoop.yarn.security.SchedulerSecurityInfo$1@3d362683
15/07/19 21:42:00 DEBUG AMRMTokenSelector: Looking for a token with service
172.17.12.10:8030
15/07/19 21:42:00 DEBUG AMRMTokenSelector: Token kind is
HDFS_DELEGATION_TOKEN and the token's service name is 172.17.12.10:8020
15/07/19 21:42:00 DEBUG AMRMTokenSelector: Token kind is YARN_AM_RM_TOKEN
and the token's service name is 172.17.12.10:8030
15/07/19 21:42:00 DEBUG SaslRpcClient: Creating SASL DIGEST-MD5(TOKEN)
client to authenticate to service at default
15/07/19 21:42:00 DEBUG SaslRpcClient: Use TOKEN authentication for
protocol ApplicationMasterProtocolPB
15/07/19 21:42:00 DEBUG SaslRpcClient: SASL client callback: setting
username: AAABTqfFTXUAAAAFAAAAARKwkeU=
15/07/19 21:42:00 DEBUG SaslRpcClient: SASL client callback: setting
userPassword
15/07/19 21:42:00 DEBUG SaslRpcClient: SASL client callback: setting realm:
default
15/07/19 21:42:00 DEBUG SaslRpcClient: Sending sasl message state: INITIATE
token:
"charset=utf-8,username=\"AAABTqfFTXUAAAAFAAAAARKwkeU=\",realm=\"default\",nonce=\"P6hVNMbIZZ+KtpdxwsktwDpkortSjhlXdI1heRHb\",nc=00000001,cnonce=\"rL0eXrixoIFyuiPaGRUGeYwFWiPbGv8JcMIqHrAV\",digest-uri=\"/default\",maxbuf=65536,response=c00d228ec16b5fc9e0a4bab4f906c249,qop=auth"
auths {
method: "TOKEN"
mechanism: "DIGEST-MD5"
protocol: ""
serverId: "default"
}
15/07/19 21:42:00 DEBUG SaslRpcClient: Received SASL message state: SUCCESS
token: "rspauth=9f9908f9b225fd633c9efe57caa5f09c"
```
2) *Spark 1.5.0-SNAPSHOT without my patch*
```
15/07/19 21:56:30 DEBUG AbstractService: Service
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
15/07/19 21:56:30 DEBUG Client: The ping interval is 60000 ms.
15/07/19 21:56:30 DEBUG Client: Connecting to
master01.paymentslab.int/172.17.12.10:8032
15/07/19 21:56:30 DEBUG UserGroupInformation: PrivilegedAction
as:bolkedebruin (auth:SIMPLE)
from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/19 21:56:30 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
15/07/19 21:56:30 DEBUG SaslRpcClient: Received SASL message state:
NEGOTIATE
auths {
method: "TOKEN"
mechanism: "DIGEST-MD5"
protocol: ""
serverId: "default"
challenge:
"realm=\"default\",nonce=\"TcfchRLxjw/FLx4eooDgeKHp+Oqh4D5I/e/b39oC\",qop=\"auth\",charset=utf-8,algorithm=md5-sess"
}
auths {
method: "KERBEROS"
mechanism: "GSSAPI"
protocol: "yarn"
serverId: "master01.paymentslab.int"
}
15/07/19 21:56:30 DEBUG SaslRpcClient: Get token info proto:interface
org.apache.hadoop.yarn.api.ApplicationClientProtocolPB
info:org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo$2@7e758e43
15/07/19 21:56:30 DEBUG RMDelegationTokenSelector: Looking for a token with
service 172.17.12.10:8032
15/07/19 21:56:30 DEBUG RMDelegationTokenSelector: Token kind is
HDFS_DELEGATION_TOKEN and the token's service name is 172.17.12.10:8020
15/07/19 21:56:30 DEBUG RMDelegationTokenSelector: Token kind is
YARN_AM_RM_TOKEN and the token's service name is
15/07/19 21:56:30 DEBUG UserGroupInformation: PrivilegedActionException
as:bolkedebruin (auth:SIMPLE)
cause:org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]
15/07/19 21:56:30 DEBUG UserGroupInformation: PrivilegedAction
as:bolkedebruin (auth:SIMPLE)
from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
15/07/19 21:56:30 WARN Client: Exception encountered while connecting to
the server : org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]
15/07/19 21:56:30 DEBUG UserGroupInformation: PrivilegedActionException
as:bolkedebruin (auth:SIMPLE) cause:java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot authenticate
via:[TOKEN, KERBEROS]
15/07/19 21:56:30 DEBUG Client: closing ipc connection to
master01.paymentslab.int/172.17.12.10:8032:
org.apache.hadoop.security.AccessControlException: Client cannot authenticate
via:[TOKEN, KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException:
Client cannot authenticate via:[TOKEN, KERBEROS]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at
org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy21.getClusterNodes(Unknown Source)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy22.getClusterNodes(Unknown Source)
at
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
at
org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
at
org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:73)
at scala.Option.foreach(Option.scala:236)
at
org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend.getDriverLogUrls(YarnClusterSchedulerBackend.scala:73)
at
org.apache.spark.SparkContext.postApplicationStart(SparkContext.scala:1993)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:544)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:516)
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]
at
org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
at
org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
at
org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
... 28 more
15/07/19 21:56:30 DEBUG Client: IPC Client (1657081124) connection to
master01.paymentslab.int/172.17.12.10:8032 from bolkedebruin: closed
15/07/19 21:56:30 INFO YarnClusterSchedulerBackend: Node Report API is not
available in the version of YARN being used, so AM logs link will not appear in
application UI
```
So it hits the same error, but here it is treated as non-fatal.
*Spark 1.5.0-SNAPSHOT with my patch*
```
15/07/19 21:47:37 DEBUG Client: The ping interval is 60000 ms.
15/07/19 21:47:37 DEBUG Client: Connecting to
master01.paymentslab.int/172.17.12.10:8032
15/07/19 21:47:37 DEBUG UserGroupInformation: PrivilegedAction
as:bolkedebruin (auth:SIMPLE)
from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/19 21:47:37 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
15/07/19 21:47:37 DEBUG SaslRpcClient: Received SASL message state:
NEGOTIATE
auths {
method: "TOKEN"
mechanism: "DIGEST-MD5"
protocol: ""
serverId: "default"
challenge:
"realm=\"default\",nonce=\"cU+ygZlbzCbqdsTml1q7BDO3FzcAJOseZPtYKkml\",qop=\"auth\",charset=utf-8,algorithm=md5-sess"
}
auths {
method: "KERBEROS"
mechanism: "GSSAPI"
protocol: "yarn"
serverId: "master01.paymentslab.int"
}
15/07/19 21:47:37 DEBUG SaslRpcClient: Get token info proto:interface
org.apache.hadoop.yarn.api.ApplicationClientProtocolPB
info:org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo$2@39cc736c
15/07/19 21:47:37 DEBUG RMDelegationTokenSelector: Looking for a token with
service 172.17.12.10:8032
15/07/19 21:47:37 DEBUG RMDelegationTokenSelector: Token kind is
HDFS_DELEGATION_TOKEN and the token's service name is 172.17.12.10:8020
15/07/19 21:47:37 DEBUG RMDelegationTokenSelector: Token kind is
YARN_AM_RM_TOKEN and the token's service name is
15/07/19 21:47:37 DEBUG RMDelegationTokenSelector: Token kind is
RM_DELEGATION_TOKEN and the token's service name is 172.17.12.10:8032
15/07/19 21:47:37 DEBUG SaslRpcClient: Creating SASL DIGEST-MD5(TOKEN)
client to authenticate to service at default
15/07/19 21:47:37 DEBUG SaslRpcClient: Use TOKEN authentication for
protocol ApplicationClientProtocolPB
15/07/19 21:47:37 DEBUG SaslRpcClient: SASL client callback: setting
username:
ABxib2xrZWRlYnJ1aW5AUEFZTUVOVFNMQUIuSU5UBHlhcm4AigFOp9tNVYoBTsvn0VUCAg==
15/07/19 21:47:37 DEBUG SaslRpcClient: SASL client callback: setting
userPassword
15/07/19 21:47:37 DEBUG SaslRpcClient: SASL client callback: setting realm:
default
15/07/19 21:47:37 DEBUG SaslRpcClient: Sending sasl message state: INITIATE
token:
"charset=utf-8,username=\"ABxib2xrZWRlYnJ1aW5AUEFZTUVOVFNMQUIuSU5UBHlhcm4AigFOp9tNVYoBTsvn0VUCAg==\",realm=\"default\",nonce=\"cU+ygZlbzCbqdsTml1q7BDO3FzcAJOseZPtYKkml\",nc=00000001,cnonce=\"xWU7TjKq9IKtci8lG185kDi4t9r9jUcM9ADW6PJY\",digest-uri=\"/default\",maxbuf=65536,response=81dc2419495d5c5c3886f031a54a78ea,qop=auth"
auths {
method: "TOKEN"
mechanism: "DIGEST-MD5"
protocol: ""
serverId: "default"
}
15/07/19 21:47:37 DEBUG SaslRpcClient: Received SASL message state: SUCCESS
token: "rspauth=b84e94b9d514c0ea602ba59f4394adfe"
15/07/19 21:47:37 DEBUG Client: Negotiated QOP is :auth
15/07/19 21:47:37 DEBUG Client: IPC Client (1586183723) connection to
master01.paymentslab.int/172.17.12.10:8032 from bolkedebruin: starting, having
connections 1
15/07/19 21:47:37 DEBUG Client: IPC Client (1586183723) connection to
master01.paymentslab.int/172.17.12.10:8032 from bolkedebruin sending #0
15/07/19 21:47:37 DEBUG Client: IPC Client (1586183723) connection to
master01.paymentslab.int/172.17.12.10:8032 from bolkedebruin got value #0
15/07/19 21:47:37 DEBUG ProtobufRpcEngine: Call: getClusterNodes took 157ms
```
3) On HDP we found the error to be fatal, while on CDH, for some reason,
it is not.
Conclusion (imho): the RM delegation token is not included in any of the
tested Spark versions, and in every case its absence causes an error; only
the consequences differ across versions. My patch fixes these errors and I
think it should be considered.
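To illustrate why the missing token matters: the logs above show the token
selector searching the credentials for a token whose service string matches
the RM address (172.17.12.10:8032), and failing because only the HDFS
(port 8020) and AM-RM (port 8030) tokens are present, which triggers the
`Client cannot authenticate via:[TOKEN, KERBEROS]` failure. Here is a
minimal, hypothetical sketch of that matching step — `SimpleToken` and
`selectToken` are illustrative names, not Hadoop's actual classes:

```java
import java.util.List;
import java.util.Optional;

public class TokenSelectorSketch {
    // Simplified stand-in for a Hadoop delegation token: kind + service address.
    record SimpleToken(String kind, String service) {}

    // Pick the first token whose service matches the address we are
    // connecting to, mirroring what the RMDelegationTokenSelector log
    // lines above suggest.
    static Optional<SimpleToken> selectToken(String service, List<SimpleToken> tokens) {
        return tokens.stream()
                .filter(t -> service.equals(t.service()))
                .findFirst();
    }

    public static void main(String[] args) {
        // Credentials as seen in the "without my patch" run: no token for port 8032.
        List<SimpleToken> withoutPatch = List.of(
                new SimpleToken("HDFS_DELEGATION_TOKEN", "172.17.12.10:8020"),
                new SimpleToken("YARN_AM_RM_TOKEN", "172.17.12.10:8030"));
        // Selection fails, so SASL has no TOKEN credential for the RM.
        System.out.println(selectToken("172.17.12.10:8032", withoutPatch).isPresent());

        // Credentials as seen in the "with my patch" run: RM delegation token added.
        List<SimpleToken> withPatch = List.of(
                new SimpleToken("HDFS_DELEGATION_TOKEN", "172.17.12.10:8020"),
                new SimpleToken("YARN_AM_RM_TOKEN", "172.17.12.10:8030"),
                new SimpleToken("RM_DELEGATION_TOKEN", "172.17.12.10:8032"));
        System.out.println(
                selectToken("172.17.12.10:8032", withPatch).map(SimpleToken::kind).orElse("none"));
    }
}
```

This is only a sketch of the service-matching idea; the real selector also
checks the token kind, which the sketch omits for brevity.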