Hi Juan,

Have you tried a Flink release built with Hadoop 2.7 or later?
If you are using Flink 1.8/1.9, that would be the Pre-bundled Hadoop 2.7+ jar,
which can be found on the Flink download page.

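I think the YARN-3103 fix is in AMRMClientImpl.class, which is bundled in the
Flink shaded Hadoop jar, so the Hadoop version Flink was built with matters
even if your YARN cluster already has the fix.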
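
To see which jar on the Flink side actually supplies that class, something like the sketch below might help. It only lists the jars in a directory that contain a given class entry. The entry path is taken from the stack trace in the quoted message; the default `lib` directory and the function name `jars_containing` are my assumptions, not anything from Flink itself.

```python
import sys
import zipfile
from pathlib import Path

# Class entry to look for, taken from the package shown in the stack trace.
AMRM_CLIENT_ENTRY = "org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.class"


def jars_containing(lib_dir, entry=AMRM_CLIENT_ENTRY):
    """Return the sorted file names of jars in lib_dir that contain `entry`."""
    matches = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                names = set(zf.namelist())
        except zipfile.BadZipFile:
            # Skip files that have a .jar suffix but are not valid archives.
            continue
        if entry in names:
            matches.append(jar.name)
    return matches


if __name__ == "__main__":
    # Assumed default: a "lib" directory under the current working directory.
    lib_dir = sys.argv[1] if len(sys.argv) > 1 else "lib"
    for name in jars_containing(lib_dir):
        print(name)
```

If more than one jar shows up, or the only match is a jar built against an older Hadoop, that would be worth checking first.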

Thanks,
Zhu Zhu

Juan Gentile <j.gent...@criteo.com> wrote on Fri, Aug 23, 2019 at 7:48 PM:

> Hello!
>
>
>
> We are running Flink on Yarn and we are currently getting the following
> error:
>
>
>
> 2019-08-23 06:11:01,534 WARN org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:XXXX (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1564713228886_5299648_000001
> 2019-08-23 06:11:01,535 WARN org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1564713228886_5299648_000001
> 2019-08-23 06:11:01,536 WARN org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as: XXXX (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1564713228886_5299648_000001
> 2019-08-23 06:11:01,581 WARN org.apache.hadoop.io.retry.RetryInvocationHandler - Exception while invoking ApplicationMasterProtocolPBClientImpl.allocate over rm0. Not retrying because Invalid or Cancelled Token
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1564713228886_5299648_000001
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>     at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:288)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:206)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:188)
>     at com.sun.proxy.$Proxy26.allocate(Unknown Source)
>     at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:277)
>     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1564713228886_5299648_000001
>     at org.apache.hadoop.ipc.Client.call(Client.java:1472)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>     at com.sun.proxy.$Proxy25.allocate(Unknown Source)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>     ... 9 more
>
>
>
> The Flink cluster runs OK for a while, but then after a day we get this
> error again. We haven’t made changes to our code, so it is hard to
> understand why we suddenly started to see this.
>
>
>
> We found this issue reported on Yarn
> https://issues.apache.org/jira/browse/YARN-3103 but our version of Yarn
> already has that fix.
>
>
>
> Any help will be appreciated.
>
>
>
> Thank you,
>
> Juan
>
