Is this location correct and valid?

LOCATION '/data/SentimentFiles/*SentimentFiles*/upload/data/tweets_raw/'
Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On 31 May 2016 at 08:50, Sandeep Giri <sand...@cloudxlab.com> wrote:

> Hi Hive Team,
>
> As per my understanding, in Hive you can create two kinds of tables:
> managed and external.
>
> In the case of a managed table, you own the data, so when you drop the
> table the data is deleted as well.
>
> In the case of an external table, you do not own the data, so when you
> drop such a table the underlying data is not deleted; only the metadata
> is.
>
> Now, I have recently observed that you cannot create an external table
> over a location on which you do not have write (modification) permission
> in HDFS. I completely fail to understand this.
>
> Use case: it is quite common that the data you are churning is huge and
> read-only. So, to churn such data via Hive, do you have to copy this
> huge data set to a location on which you have write permission?
>
> Please help.
>
> My data is located in an HDFS folder
> (/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/) on which I
> have read-only permission, and I am trying to execute the following
> command:
>
> CREATE EXTERNAL TABLE tweets_raw (
>   id BIGINT,
>   created_at STRING,
>   source STRING,
>   favorited BOOLEAN,
>   retweet_count INT,
>   retweeted_status STRUCT<
>     text:STRING,
>     users:STRUCT<screen_name:STRING,name:STRING>>,
>   entities STRUCT<
>     urls:ARRAY<STRUCT<expanded_url:STRING>>,
>     user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
>     hashtags:ARRAY<STRUCT<text:STRING>>>,
>   text STRING,
>   user1 STRUCT<
>     screen_name:STRING,
>     name:STRING,
>     friends_count:INT,
>     followers_count:INT,
>     statuses_count:INT,
>     verified:BOOLEAN,
>     utc_offset:STRING,  -- was INT but nulls are strings
>     time_zone:STRING>,
>   in_reply_to_screen_name STRING,
>   year INT,
>   month INT,
>   day INT,
>   hour INT
> )
> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
> WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
> LOCATION '/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/';
>
> It throws the following error:
>
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask.
> MetaException(message:java.security.AccessControlException: Permission
> denied: user=sandeep, access=WRITE,
> inode="/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw":hdfs:hdfs:drwxr-xr-x
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)
>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1771)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1755)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1729)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:8348)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:1978)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1443)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
>
> --
> Regards,
> Sandeep Giri
> +1-(347) 781-4573 (US)
> +91-953-899-8962 (IN)
> www.CloudxLab.com (A Hadoop cluster for practicing)
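A note for anyone who lands on this thread with the same problem. The access=WRITE denial in the trace is raised through the NameNode's checkAccess call made on behalf of the Hive metastore, which is consistent with storage-based authorization being enabled there: before creating a table, the metastore verifies filesystem permissions on the target directory, and in the observed behaviour table creation asks for write access even for EXTERNAL tables. The sketch below, in HiveQL, shows how to confirm the permissions from the Hive CLI and one possible way around the check. Having an administrator with write access issue the DDL is an assumption about what this particular cluster allows, not something confirmed in the thread, and the SELECT is only an illustrative query against the columns defined above.

    -- The Hive CLI passes "dfs" commands through to HDFS, so you can
    -- confirm exactly what the metastore's permission check sees:
    dfs -ls /data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/;
    -- drwxr-xr-x owned by hdfs:hdfs means user "sandeep" has read and
    -- execute but no write, matching the access=WRITE denial above.

    -- Hypothetical workaround: a user who does have write access on the
    -- directory (for example, an admin acting as hdfs) runs the CREATE
    -- EXTERNAL TABLE statement from the original mail. Read-only users
    -- can then query the table, since SELECT needs only read access:
    SELECT created_at, text FROM tweets_raw LIMIT 10;

    -- And, as described above for external tables, DROP removes only the
    -- metastore entry; the read-only files underneath are left untouched:
    DROP TABLE tweets_raw;

In other words, the write requirement appears to bite only at the DDL step; once the table exists, reading the data should not require copying it to a writable location.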