Is this location correct and valid?

LOCATION '/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/'
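One way to check is from the Hive CLI itself, using the dfs command (a quick sketch; it assumes dfs commands have not been disabled and that the CLI points at the same cluster):

    -- run inside the Hive CLI or Beeline; dfs passes the command through to HDFS
    dfs -ls /data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/;

If the path exists, this lists its contents along with the owner, group and permission bits that the AccessControlException below refers to.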

Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 31 May 2016 at 08:50, Sandeep Giri <sand...@cloudxlab.com> wrote:

> Hi Hive Team,
>
> As per my understanding, in Hive you can create two kinds of tables:
> managed and external.
>
> In the case of a managed table, Hive owns the data, so when you drop
> the table the data is deleted as well.
>
> In the case of an external table, Hive does not own the data, so when
> you drop such a table the underlying data is not deleted; only the
> metadata is removed.
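>
> For example (a minimal sketch; the table names, column and path are
> illustrative):
>
>     -- managed: DROP TABLE removes both the metadata and the data
>     CREATE TABLE t_managed (id BIGINT);
>     DROP TABLE t_managed;
>
>     -- external: DROP TABLE removes only the metastore entry;
>     -- the files under LOCATION are left untouched
>     CREATE EXTERNAL TABLE t_external (id BIGINT)
>     LOCATION '/data/some/existing/path';
>     DROP TABLE t_external;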
>
> Now, I have recently observed that you cannot create an external table
> over a location on which you do not have write (modification)
> permission in HDFS. I completely fail to understand this.
>
> Use case: it is quite common that the data you are churning is huge and
> read-only. So, to churn such data via Hive, do you have to copy this
> huge data set to a location on which you have write permission?
>
> Please help.
>
> My data is located in an HDFS folder
> (/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/) on which I
> have read-only permission, and I am trying to execute the following
> command:
>
> CREATE EXTERNAL TABLE tweets_raw (
>     id BIGINT,
>     created_at STRING,
>     source STRING,
>     favorited BOOLEAN,
>     retweet_count INT,
>     retweeted_status STRUCT<
>         text:STRING,
>         users:STRUCT<screen_name:STRING,name:STRING>>,
>     entities STRUCT<
>         urls:ARRAY<STRUCT<expanded_url:STRING>>,
>         user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
>         hashtags:ARRAY<STRUCT<text:STRING>>>,
>     text STRING,
>     user1 STRUCT<
>         screen_name:STRING,
>         name:STRING,
>         friends_count:INT,
>         followers_count:INT,
>         statuses_count:INT,
>         verified:BOOLEAN,
>         utc_offset:STRING, -- was INT but nulls are strings
>         time_zone:STRING>,
>     in_reply_to_screen_name STRING,
>     year INT,
>     month INT,
>     day INT,
>     hour INT
> )
> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
> WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
> LOCATION '/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/';
>
> It throws the following error:
>
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask.
> MetaException(message:java.security.AccessControlException: Permission
> denied: user=sandeep, access=WRITE,
> inode="/data/SentimentFiles/SentimentFiles/upload/data/tweets_raw":hdfs:hdfs:drwxr-xr-x
>         at
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1771)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1755)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1729)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:8348)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:1978)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1443)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
>
>
>
> --
> Regards,
> Sandeep Giri,
> +1-(347) 781-4573 (US)
> +91-953-899-8962 (IN)
> www.CloudxLab.com  (A Hadoop cluster for practicing)
>
>
