Hi, I can't get this to work with CDH 5.4 and Spark 1.4.0 in yarn-cluster mode. @andrew, did you manage to get it to work with the latest version?
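For reference, here is roughly how I'm submitting; the principal, keytab, class name, and jar below are placeholders, so treat this as a sketch rather than my exact command (as I understand it, Spark 1.4.0 added the --principal/--keytab options for secure YARN apps):

    spark-submit \
      --master yarn-cluster \
      --principal user@EXAMPLE.COM \
      --keytab /path/to/user.keytab \
      --class MyHiveContextApp \
      my-hivecontext-app.jar

I've also put a sketch of the UGI/doAs pattern Andrew asks about below the quoted thread.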
On Tue, Apr 21, 2015 at 00:02, Andrew Lee <alee...@hotmail.com> wrote:

> Hi Marcelo,
>
> Exactly what I need to track, thanks for the JIRA pointer.
>
> > Date: Mon, 20 Apr 2015 14:03:55 -0700
> > Subject: Re: GSSException when submitting Spark job in yarn-cluster mode with HiveContext APIs on Kerberos cluster
> > From: van...@cloudera.com
> > To: alee...@hotmail.com
> > CC: user@spark.apache.org
> >
> > I think you want to take a look at:
> > https://issues.apache.org/jira/browse/SPARK-6207
> >
> > On Mon, Apr 20, 2015 at 1:58 PM, Andrew Lee <alee...@hotmail.com> wrote:
> > > Hi All,
> > >
> > > Affected versions: Spark 1.2.1 / 1.2.2 / 1.3-rc1
> > >
> > > Posting this problem to the user group first to see if someone is encountering the same problem.
> > >
> > > When submitting Spark jobs that invoke HiveContext APIs on a Kerberos Hadoop + YARN (2.4.1) cluster, I'm getting this error:
> > >
> > > javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> > >
> > > Apparently the Kerberos ticket is not on the remote data node or compute node, since we don't deploy Kerberos tickets, and doing so would not be good practice anyway. On the other hand, we can't just SSH to every machine and run kinit for those users; that is neither practical nor secure.
> > >
> > > The point here is: shouldn't there be a delegation token during the doAs, so that the token is used instead of the ticket? I'm trying to understand what is missing in Spark's HiveContext API, given that a normal MapReduce job invoking the Hive APIs works but Spark SQL does not. Any insights or feedback are appreciated.
> > >
> > > Has anyone got this running without pre-deploying (pre-initializing) tickets node by node? Is this worth filing a JIRA?
> > >
> > > 15/03/25 18:59:08 INFO hive.metastore: Trying to connect to metastore with URI thrift://alee-cluster.test.testserver.com:9083
> > > 15/03/25 18:59:08 ERROR transport.TSaslTransport: SASL negotiation failure
> > > javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> > >     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
> > >     at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> > >     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
> > >     at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> > >     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> > >     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> > >     at java.security.AccessController.doPrivileged(Native Method)
> > >     at javax.security.auth.Subject.doAs(Subject.java:415)
> > >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> > >     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
> > >     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:336)
> > >     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:214)
> > >     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > >     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> > >     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> > >     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> > >     at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
> > >     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
> > >     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
> > >     at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
> > >     at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
> > >     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
> > >     at org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:235)
> > >     at org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:231)
> > >     at scala.Option.orElse(Option.scala:257)
> > >     at org.apache.spark.sql.hive.HiveContext.x$3$lzycompute(HiveContext.scala:231)
> > >     at org.apache.spark.sql.hive.HiveContext.x$3(HiveContext.scala:229)
> > >     at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:229)
> > >     at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:229)
> > >     at org.apache.spark.sql.hive.HiveMetastoreCatalog.<init>(HiveMetastoreCatalog.scala:55)
> > >     at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:253)
> > >     at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:253)
> > >     at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:253)
> > >     at org.apache.spark.sql.hive.HiveContext$$anon$4.<init>(HiveContext.scala:263)
> > >     at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:263)
> > >     at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:262)
> > >     at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)
> > >     at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)
> > >     at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
> > >     at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
> > >     at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:102)
> > >     at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:106)
> > >     at SparkSQLTestCase2HiveContextYarnClusterApp$.main(sparksql_hivecontext_examples_yarncluster.scala:17)
> > >     at SparkSQLTestCase2HiveContextYarnClusterApp.main(sparksql_hivecontext_examples_yarncluster.scala)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >     at java.lang.reflect.Method.invoke(Method.java:606)
> > >     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:441)
> > > Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
> > >     at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
> > >     at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
> > >     at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
> > >     at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
> > >     at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
> > >     at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
> > >     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
> > >     ... 48 more
> >
> > --
> > Marcelo
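As promised above: the delegation-token/doAs pattern Andrew refers to looks roughly like the minimal Scala sketch below. The principal, keytab path, and body of run() are placeholders; this only illustrates the standard Hadoop UserGroupInformation mechanism, not a fix for the Spark issue.

    import java.security.PrivilegedExceptionAction
    import org.apache.hadoop.security.UserGroupInformation

    object UgiDoAsSketch {
      def main(args: Array[String]): Unit = {
        // Log in from a keytab instead of relying on a local ticket cache
        // being present on every node (placeholder principal and keytab).
        val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
          "user@EXAMPLE.COM", "/path/to/user.keytab")

        // Hadoop RPC clients created inside doAs pick up this UGI's
        // credentials (TGT or delegation tokens) automatically.
        ugi.doAs(new PrivilegedExceptionAction[Unit] {
          override def run(): Unit = {
            // e.g. open a Hive metastore or HDFS client here
          }
        })
      }
    }

The catch in yarn-cluster mode is that the driver runs on an arbitrary cluster node with no ticket cache, so Spark itself has to obtain a Hive metastore delegation token at submit time and ship it with the job, which is what SPARK-6207 tracks.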