Now I got a bit closer and discovered that my problem is related to permissions.
For example:
drwxr-xr-x - margusja hdfs 0 2016-05-12 03:33 /tmp/files_10k
...
-rw-r--r-- 3 margusja hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1959.txt
-rw-r--r-- 3 margusja hdfs 4 2016-05-12 02:01 /tmp/files_10k/f196.txt
-rw-r--r-- 3 margusja hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1960.txt
-rw-r--r-- 3 margusja hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1961.txt
-rw-r--r-- 3 margusja hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1962.txt
-rw-r--r-- 3 margusja hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1963.txt
-rw-r--r-- 3 margusja hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1964.txt
-rw-r--r-- 3 margusja hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1965.txt
-rw-r--r-- 3 margusja hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1966.txt
...
Connected to: Apache Hive (version 1.2.1.2.3.4.0-3485)
Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://bigdata29.webmedia.int:10000/> create external table
files_10k (i int) row format delimited fields terminated by '\t'
location '/tmp/files_10k';
No rows affected (3.184 seconds)
0: jdbc:hive2://bigdata29.webmedia.int:10000/>
Now I change the owner to flume, for example:
drwxr-xr-x - flume hdfs 0 2016-05-12 03:33 /tmp/files_10k
...
-rw-r--r-- 3 flume hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1968.txt
-rw-r--r-- 3 flume hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1969.txt
-rw-r--r-- 3 flume hdfs 4 2016-05-12 02:01 /tmp/files_10k/f197.txt
-rw-r--r-- 3 flume hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1970.txt
-rw-r--r-- 3 flume hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1971.txt
-rw-r--r-- 3 flume hdfs 5 2016-05-12 02:01 /tmp/files_10k/f1972.txt
...
Others can still read. For example, user margusja can read:
[margusja@bigdata29 ~]$ hdfs dfs -ls /tmp/files_10k
Found 1112 items
-rw-r--r-- 3 flume hdfs 2 2016-05-12 01:59 /tmp/files_10k/f1.txt
-rw-r--r-- 3 flume hdfs 3 2016-05-12 01:59 /tmp/files_10k/f10.txt
-rw-r--r-- 3 flume hdfs 4 2016-05-12 01:59 /tmp/files_10k/f100.txt
-rw-r--r-- 3 flume hdfs 5 2016-05-12 01:59 /tmp/files_10k/f1000.txt
-rw-r--r-- 3 flume hdfs 6 2016-05-12 01:59 /tmp/files_10k/f10000.txt
Now I try to create a table:
0: jdbc:hive2://bigdata29.webmedia.int:10000/> create external table
files_10k (i int) row format delimited fields terminated by '\t'
location '/tmp/files_10k';
Error: Error while compiling statement: FAILED:
HiveAccessControlException Permission denied: user [margusja] does not
have [READ] privilege on [hdfs://mycluster/tmp/files_10k]
(state=42000,code=40000)
0: jdbc:hive2://bigdata29.webmedia.int:10000/>
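Plain HDFS permissions can be checked directly, which helps separate the HDFS layer from the Hive authorization layer. A minimal sketch of such a check (class name is hypothetical; assumes Hadoop 2.6+ for FileSystem.access and a simple-auth cluster, since UserGroupInformation.createRemoteUser does not authenticate against Kerberos):

import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.security.UserGroupInformation;

public class CheckHdfsRead {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath,
        // including the hdfs://mycluster HA nameservice.
        Configuration conf = new Configuration();
        // Hypothetical check as user margusja (simple auth only).
        UserGroupInformation ugi = UserGroupInformation.createRemoteUser("margusja");
        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            FileSystem fs = FileSystem.get(conf);
            Path dir = new Path("/tmp/files_10k");
            // Throws AccessControlException if HDFS itself denies READ;
            // returning normally means the POSIX bits (r-x for others) allow it.
            fs.access(dir, FsAction.READ);
            System.out.println("HDFS grants READ on " + dir);
            return null;
        });
    }
}

If this passes while beeline still fails, the denial is coming from the Hive authorization layer rather than from HDFS permissions.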
In Hiveserver2.log:
2016-05-12 03:38:58,111 INFO [HiveServer2-Handler-Pool: Thread-69]:
parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command:
create external table files_10k (i int) row format delimited fields
terminated by '\t' location '/tmp/files_10k'
2016-05-12 03:38:58,112 INFO [HiveServer2-Handler-Pool: Thread-69]:
parse.ParseDriver (ParseDriver.java:parse(209)) - Parse Completed
2016-05-12 03:38:58,112 INFO [HiveServer2-Handler-Pool: Thread-69]:
log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) - </PERFLOG
method=parse start=1463038738111 end=1463038738112 duration=1
from=org.apache.hadoop.hive.ql.Driver>
2016-05-12 03:38:58,112 INFO [HiveServer2-Handler-Pool: Thread-69]:
log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG
method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2016-05-12 03:38:58,112 INFO [HiveServer2-Handler-Pool: Thread-69]:
parse.CalcitePlanner (SemanticAnalyzer.java:analyzeInternal(10114)) -
Starting Semantic Analysis
2016-05-12 03:38:58,113 INFO [HiveServer2-Handler-Pool: Thread-69]:
parse.CalcitePlanner (SemanticAnalyzer.java:analyzeCreateTable(10776)) -
Creating table default.files_10k position=22
2016-05-12 03:38:58,113 INFO [HiveServer2-Handler-Pool: Thread-69]:
metastore.HiveMetaStore (HiveMetaStore.java:logInfo(747)) - 1:
get_database: default
2016-05-12 03:38:58,113 INFO [HiveServer2-Handler-Pool: Thread-69]:
HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(372)) -
ugi=hive/bigdata29.webmedia....@testhadoop.com ip=unknown-ip-addr
cmd=get_database: default
2016-05-12 03:38:58,118 INFO [HiveServer2-Handler-Pool: Thread-69]:
ql.Driver (Driver.java:compile(466)) - Semantic Analysis Completed
2016-05-12 03:38:58,118 INFO [HiveServer2-Handler-Pool: Thread-69]:
log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) - </PERFLOG
method=semanticAnalyze start=1463038738112 end=1463038738118 duration=6
from=org.apache.hadoop.hive.ql.Driver>
2016-05-12 03:38:58,118 INFO [HiveServer2-Handler-Pool: Thread-69]:
ql.Driver (Driver.java:getSchema(246)) - Returning Hive schema:
Schema(fieldSchemas:null, properties:null)
2016-05-12 03:38:58,118 INFO [HiveServer2-Handler-Pool: Thread-69]:
log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG
method=doAuthorization from=org.apache.hadoop.hive.ql.Driver>
2016-05-12 03:39:00,148 INFO
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@53bb71e5]:
util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause in
JVM or host machine (eg GC): pause of approximately 1916ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=2002ms
2016-05-12 03:39:01,733 INFO
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@53bb71e5]:
util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause in
JVM or host machine (eg GC): pause of approximately 1081ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1455ms
2016-05-12 03:39:20,984 ERROR [HiveServer2-Handler-Pool: Thread-69]:
authorizer.RangerHiveAuthorizer
(RangerHiveAuthorizer.java:isURIAccessAllowed(755)) - Error getting
permissions for hdfs://mycluster/tmp/files_10k
java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
I am confused. What extra rights does Hive expect?
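The stack trace quoted further down (FileUtils.isOwnerOfFileHierarchy called from RangerHiveAuthorizer.isURIAccessAllowed) suggests an answer: when no explicit policy grants access to the URI, the authorizer appears to fall back to checking whether the session user owns the entire file hierarchy, not merely whether it can read it. A rough sketch of that ownership walk (simplified; OwnershipWalk.ownsHierarchy is a hypothetical name, not the actual Hive code):

import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class OwnershipWalk {
    // Returns true only if 'user' owns 'path' and everything under it.
    static boolean ownsHierarchy(FileSystem fs, Path path, String user) throws IOException {
        FileStatus status = fs.getFileStatus(path);
        if (!user.equals(status.getOwner())) {
            return false;
        }
        if (status.isDirectory()) {
            // listStatus materializes one FileStatus per child in memory;
            // with ~10k files per directory this is where heap pressure builds.
            for (FileStatus child : fs.listStatus(path)) {
                if (!ownsHierarchy(fs, child.getPath(), user)) {
                    return false;
                }
            }
        }
        return true;
    }
}

Under a check like this, chown-ing the directory to flume makes the create fail for margusja even though the o+r bits still allow reading.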
Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 48 780
On 11/05/16 14:17, Margus Roo wrote:
One more example:
[hdfs@hadoopnn1 ~]$ hdfs dfs -count -h /user/margusja/files_10k/
1 9.8 K 47.7 K /user/margusja/files_10k
[hdfs@hadoopnn1 ~]$ hdfs dfs -count -h /datasource/dealgate/
53 7.9 K 8.5 G /datasource/dealgate
2: jdbc:hive2://hadoopnn1.estpak.ee:10000/def> create external table
files_10k (i int) row format delimited fields terminated by '\t'
location '/user/margusja/files_10k';
No rows affected (0.197 seconds)
2: jdbc:hive2://hadoopnn1.estpak.ee:10000/def> drop table files_10k;
No rows affected (0.078 seconds)
2: jdbc:hive2://hadoopnn1.estpak.ee:10000/def> create external table
files_10k (i int) row format delimited fields terminated by '\t'
location '/datasource/dealgate';
Error: org.apache.thrift.transport.TTransportException
(state=08S01,code=0)
2: jdbc:hive2://hadoopnn1.estpak.ee:10000/def>
So from my point of view, beeline for some reason looks at the data and the old hive client does not.
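One plausible reason for the difference: beeline talks to HiveServer2 over JDBC, and the doAuthorization step visible in the logs (with the Ranger authorizer) runs server-side there, while the old hive CLI compiles the statement locally. A minimal JDBC sketch of the path beeline takes (hypothetical class; URL, database, and credentials are placeholders for a non-Kerberos connection):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateViaHS2 {
    public static void main(String[] args) throws Exception {
        // Same route as beeline: HiveServer2 via JDBC, so the server-side
        // authorization check runs before the statement executes.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hadoopnn1.estpak.ee:10000/default", "margusja", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute("create external table files_10k (i int) "
                + "row format delimited fields terminated by '\\t' "
                + "location '/user/margusja/files_10k'");
        }
    }
}

Running the statement through this path should reproduce the beeline behaviour, including the authorization check.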
On 11/05/16 13:35, Margus Roo wrote:
More information:
2016-05-11 13:31:17,086 INFO [HiveServer2-Handler-Pool:
Thread-5867]: parse.ParseDriver (ParseDriver.java:parse(185)) -
Parsing command: create external table files_10k (i int) row format
delimited fields terminated by '\t' location '/user/margusja/files_10k'
2016-05-11 13:31:17,089 INFO [HiveServer2-Handler-Pool:
Thread-5867]: parse.ParseDriver (ParseDriver.java:parse(209)) - Parse
Completed
2016-05-11 13:31:17,089 INFO [HiveServer2-Handler-Pool:
Thread-5867]: log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) -
</PERFLOG method=parse start=1462962677086 end=1462962677089
duration=3 from=org.apache.hadoop.hive.ql.Driver>
2016-05-11 13:31:17,089 INFO [HiveServer2-Handler-Pool:
Thread-5867]: log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) -
<PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2016-05-11 13:31:17,090 INFO [HiveServer2-Handler-Pool:
Thread-5867]: parse.CalcitePlanner
(SemanticAnalyzer.java:analyzeInternal(10114)) - Starting Semantic
Analysis
2016-05-11 13:31:17,093 INFO [HiveServer2-Handler-Pool:
Thread-5867]: parse.CalcitePlanner
(SemanticAnalyzer.java:analyzeCreateTable(10776)) - Creating table
default.files_10k position=22
2016-05-11 13:31:17,094 INFO [HiveServer2-Handler-Pool:
Thread-5867]: metastore.HiveMetaStore
(HiveMetaStore.java:logInfo(747)) - 2: get_database: default
2016-05-11 13:31:17,094 INFO [HiveServer2-Handler-Pool:
Thread-5867]: HiveMetaStore.audit
(HiveMetaStore.java:logAuditEvent(372)) -
ugi=hive/hadoopnn1.estpak...@testhadoop.com ip=unknown-ip-addr
cmd=get_database: default
2016-05-11 13:31:17,098 WARN [HiveServer2-Handler-Pool:
Thread-5867]: security.UserGroupInformation
(UserGroupInformation.java:getGroupNames(1521)) - No groups available
for user hive
2016-05-11 13:31:17,098 WARN [HiveServer2-Handler-Pool:
Thread-5867]: security.UserGroupInformation
(UserGroupInformation.java:getGroupNames(1521)) - No groups available
for user hive
2016-05-11 13:31:17,099 WARN [HiveServer2-Handler-Pool:
Thread-5867]: security.UserGroupInformation
(UserGroupInformation.java:getGroupNames(1521)) - No groups available
for user hive
2016-05-11 13:31:17,099 WARN [HiveServer2-Handler-Pool:
Thread-5867]: security.UserGroupInformation
(UserGroupInformation.java:getGroupNames(1521)) - No groups available
for user hive
2016-05-11 13:31:17,099 INFO [HiveServer2-Handler-Pool:
Thread-5867]: metadata.HiveUtils
(HiveUtils.java:getMetaStoreAuthorizeProviderManagers(353)) - Adding
metastore authorization provider:
org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
2016-05-11 13:31:17,102 WARN [HiveServer2-Handler-Pool:
Thread-5867]: security.UserGroupInformation
(UserGroupInformation.java:getGroupNames(1521)) - No groups available
for user hive
2016-05-11 13:31:17,102 WARN [HiveServer2-Handler-Pool:
Thread-5867]: security.UserGroupInformation
(UserGroupInformation.java:getGroupNames(1521)) - No groups available
for user hive
2016-05-11 13:31:17,106 INFO [HiveServer2-Handler-Pool:
Thread-5867]: ql.Driver (Driver.java:compile(466)) - Semantic
Analysis Completed
2016-05-11 13:31:17,106 INFO [HiveServer2-Handler-Pool:
Thread-5867]: log.PerfLogger (PerfLogger.java:PerfLogEnd(162)) -
</PERFLOG method=semanticAnalyze start=1462962677089
end=1462962677106 duration=17 from=org.apache.hadoop.hive.ql.Driver>
2016-05-11 13:31:17,106 INFO [HiveServer2-Handler-Pool:
Thread-5867]: ql.Driver (Driver.java:getSchema(246)) - Returning Hive
schema: Schema(fieldSchemas:null, properties:null)
2016-05-11 13:31:17,106 INFO [HiveServer2-Handler-Pool:
Thread-5867]: log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) -
<PERFLOG method=doAuthorization from=org.apache.hadoop.hive.ql.Driver>
2016-05-11 13:31:17,107 WARN [HiveServer2-Handler-Pool:
Thread-5867]: security.UserGroupInformation
(UserGroupInformation.java:getGroupNames(1521)) - No groups available
for user margusja
2016-05-11 13:31:18,289 INFO
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@59f45950]:
util.JvmPauseMonitor (JvmPauseMonitor.java:run(195)) - Detected pause
in JVM or host machine (eg GC): pause of approximately 1092ms
2016-05-11 13:31:29,547 INFO [HiveServer2-Handler-Pool:
Thread-5867]: retry.RetryInvocationHandler
(RetryInvocationHandler.java:invoke(144)) - Exception while invoking
getListing of class ClientNamenodeProtocolTranslatorPB over
hadoopnn1.estpak.ee/88.196.164.42:8020. Trying to fail over immediately.
java.io.IOException: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.hadoop.ipc.ProtobufHelper.getRemoteException(ProtobufHelper.java:47)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:580)
        at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy16.getListing(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2094)
        at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2077)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:832)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:863)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:859)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:859)
        at org.apache.hadoop.hive.common.FileUtils.isOwnerOfFileHierarchy(FileUtils.java:481)
        at org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.isURIAccessAllowed(RangerHiveAuthorizer.java:749)
        at org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges(RangerHiveAuthorizer.java:252)
        at org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:817)
        at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:608)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:499)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:314)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1164)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1158)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:410)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:397)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:271)
        at com.sun.proxy.$Proxy15.getListing(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:573)
        ... 39 more
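The trace shows the ownership walk dying inside getListing while materializing directory listings, which matches the ~10k small files: each listStatus call builds a full FileStatus array on the HiveServer2 heap. For comparison, an iterator-based walk keeps memory bounded, because the client fetches the listing in NameNode-sized batches. A sketch of that idea (not the actual Hive/Ranger code; it checks the root directory plus files only, so per-subdirectory ownership would need a separate listLocatedStatus pass):

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public final class IterativeOwnershipWalk {
    // Streams directory entries instead of building one FileStatus[]
    // with thousands of entries per listStatus() call.
    static boolean ownsAllFiles(FileSystem fs, Path root, String user) throws IOException {
        if (!user.equals(fs.getFileStatus(root).getOwner())) {
            return false;
        }
        RemoteIterator<LocatedFileStatus> it = fs.listFiles(root, true); // recursive
        while (it.hasNext()) {
            if (!user.equals(it.next().getOwner())) {
                return false;
            }
        }
        return true;
    }
}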
I have HDFS NameNode high availability configured and automatic failover enabled. I can see that the active NameNode does not change while the table is being created.
I also have Hive high availability configured.
On 11/05/16 12:26, Margus Roo wrote:
Sadly, it fails in our environment. I generated the files like you did:
Connected to: Apache Hive (version 1.2.1.2.3.4.0-3485)
Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoopnn1.estpak.ee:2181,hado> create external table
files_10k (i int) row format delimited fields terminated by '\t'
location '/user/margusja/files_10k';
Error: Shutdown in progress, cannot remove a shutdownHook
(state=,code=0)
0: jdbc:hive2://hadoopnn1.estpak.ee:2181,hado>
Using just hive:
[margusja@hadoopnn1 ~]$ hive
WARNING: Use "yarn jar" to launch YARN applications.
log4j:WARN No such property [maxBackupIndex] in
org.apache.log4j.DailyRollingFileAppender.
Logging initialized using configuration in
file:/etc/hive/2.3.4.0-3485/0/hive-log4j.properties
hive> create external table files_10k (i int) row format delimited
fields terminated by '\t' location '/user/margusja/files_10k';
OK
Time taken: 1.255 seconds
hive>
On 11/05/16 10:16, Markovitz, Dudu wrote:
create external table files_10k (i int) row format delimited fields
terminated by '\t' location '/tmp/files_10k';