Hi there! Reading the error, it looks like the opposite: HDFS denied your user READ_EXECUTE on the table directory. The inode is owned by hive:hive with mode drwxrwx--t, so only the hive user and members of the hive group can list or read it; everyone else has no access at the HDFS level. What operation are you performing? (See also my note on HiveServer2 after your quoted message below.)
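You can check what HDFS actually grants your user with a quick probe. Here is an untested sketch (it assumes Hadoop 2.6+, which provides FileSystem.access(), and reuses the path from your error):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.hadoop.fs.permission.FsAction

    object CheckWarehouseAccess {
      def main(args: Array[String]): Unit = {
        val fs = FileSystem.get(new Configuration())
        val dir = new Path(
          "/user/hive/warehouse/rt_freewheel_mastering.db/digital_profile_cluster_in")
        // Asks the NameNode to run the same permission check that fails in
        // your stack trace; throws AccessControlException if denied.
        fs.access(dir, FsAction.READ_EXECUTE)
        println(s"READ_EXECUTE granted on $dir")
      }
    }

If this throws the same AccessControlException when run as kakn, the problem is plain HDFS permissions rather than anything Spark-specific.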
Ajay

On Jun 17, 2015, at 5:24 PM, nitinkak001 <nitinkak...@gmail.com> wrote:

> I am trying to run a Hive query from Spark code using a HiveContext object.
> It was running fine earlier, but since Apache Sentry was installed the
> process fails with this exception:
>
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=kakn, access=READ_EXECUTE,
> inode="/user/hive/warehouse/rt_freewheel_mastering.db/digital_profile_cluster_in":hive:hive:drwxrwx--t
>
> I have pasted the full stack trace at the end of this post. My username
> "kakn" is a registered user with Sentry. I know that Spark takes all of its
> configuration from hive-site.xml to execute the HQL, so I added a few
> Sentry-specific properties, but they seem to have no effect. It seems that
> the HiveContext is not going through HiveServer2 (which, as I understand it,
> talks to the Sentry component for user translation/delegation). I have
> attached the hive-site.xml
> (http://apache-spark-user-list.1001560.n3.nabble.com/file/n23381/hive-site.xml).
>
> <property>
>   <name>hive.security.authorization.task.factory</name>
>   <value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
> </property>
> <property>
>   <name>hive.metastore.pre.event.listeners</name>
>   <value>org.apache.sentry.binding.metastore.MetastoreAuthzBinding</value>
>   <description>list of comma separated listeners for metastore events.</description>
> </property>
> <property>
>   <name>hive.warehouse.subdir.inherit.perms</name>
>   <value>true</value>
> </property>
>
> Full stack trace:
>
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=kakn, access=READ_EXECUTE,
> inode="/user/hive/warehouse/rt_freewheel_mastering.db/digital_profile_cluster_in":hive:hive:drwxrwx--t
>     at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
>     at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
>     at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:151)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6287)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6269)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6194)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4793)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4755)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:800)
>     at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getListing(AuthorizationProviderProxyClientProtocol.java:310)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:606)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>     at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>     at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1895)
>     at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1876)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:104)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:716)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:712)
>     at org.apache.spark.sql.parquet.ParquetTypesConverter$.readMetaData(ParquetTypes.scala:440)
>     at org.apache.spark.sql.parquet.ParquetTypesConverter$.readSchemaFromFile(ParquetTypes.scala:477)
>     at org.apache.spark.sql.parquet.ParquetRelation.<init>(ParquetRelation.scala:65)
>     at org.apache.spark.sql.SQLContext.parquetFile(SQLContext.scala:165)
>     at org.apache.spark.sql.hive.HiveStrategies$ParquetConversion$.apply(HiveStrategies.scala:149)
>     at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
>     at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
>     at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>     at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
>     at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)
>     at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)
>     at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)
>     at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)
>     at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
>     at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
>     at org.apache.spark.sql.SchemaRDD.getDependencies(SchemaRDD.scala:127)
>     at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:192)
>     at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:190)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.dependencies(RDD.scala:190)
>     at org.apache.spark.rdd.RDD.firstParent(RDD.scala:1239)
>     at org.apache.spark.sql.SchemaRDD.getPartitions(SchemaRDD.scala:122)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
>     at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
>     at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:79)
>     at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:80)
>     at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:192)
>     at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:190)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.dependencies(RDD.scala:190)
>     at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:301)
>     at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:313)
>     at org.apache.spark.scheduler.DAGScheduler.newStage(DAGScheduler.scala:247)
>     at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:734)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1389)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
> Permission denied: user=kakn, access=READ_EXECUTE,
> inode="/user/hive/warehouse/rt_freewheel_mastering.db/digital_profile_cluster_in":hive:hive:drwxrwx--t
>     at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
>     at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
>     at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:151)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6287)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6269)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6194)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4793)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4755)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:800)
>     at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getListing(AuthorizationProviderProxyClientProtocol.java:310)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:606)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>     at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>     at com.sun.proxy.$Proxy14.getListing(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:546)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at com.sun.proxy.$Proxy15.getListing(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1893)
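P.S. Your suspicion about HiveServer2 matches my understanding: a HiveContext only consults the metastore for table metadata and then reads the table's files directly from HDFS as the submitting user, so HiveServer2 (and with it Sentry's authorization and user delegation) is never involved. Sentry protects data at the file level only indirectly, by locking the warehouse directories down to hive:hive, which is exactly the denial you are seeing. Until the HDFS side is opened up for your user (for example via Sentry's HDFS ACL synchronization, where your distribution supports it), one workaround is to send the query through HiveServer2 over JDBC so Sentry's policies actually apply. A rough sketch; the host, port, and query are placeholders, not taken from your post:

    import java.sql.DriverManager

    object QueryViaHiveServer2 {
      def main(args: Array[String]): Unit = {
        // Standard Hive JDBC driver, shipped with the Hive client libraries.
        Class.forName("org.apache.hive.jdbc.HiveDriver")
        // Placeholder endpoint: substitute your HiveServer2 host and port.
        val conn = DriverManager.getConnection(
          "jdbc:hive2://hiveserver2-host:10000/rt_freewheel_mastering", "kakn", "")
        val stmt = conn.createStatement()
        // Hypothetical query against the table named in the error message.
        val rs = stmt.executeQuery(
          "SELECT * FROM digital_profile_cluster_in LIMIT 10")
        while (rs.next()) println(rs.getString(1))
        rs.close(); stmt.close(); conn.close()
      }
    }

The trade-off is that the query then executes inside HiveServer2 rather than on your Spark cluster, so this fits small lookups better than heavy ETL.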