[ 
https://issues.apache.org/jira/browse/HIVE-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HIVE-24342:
---------------------------------------
    Description: 
Currently, isPathEncrypted decides whether a path is on HDFS by checking only 
that the path scheme is "hdfs".

With mounted ViewFileSystem-based file systems such as ViewFSOverloadScheme or 
ViewHDFS (HDFS-15289), we also need to check that the resolved path is really 
on HDFS.

In the ViewHDFS case, we can mount hdfs://ns1/test ---> o3fs://b.v.ozone1/test

When a user runs a query against the path hdfs://ns1/test, isPathEncrypted 
assumes the path is on HDFS because it checks only the path scheme. The 
HDFS-specific getEZForPath call then fails on the o3fs file system:

 
{code:java}
0: jdbc:hive2://umag-1.umag.root.xxx.site:218> select * from test30;
Error: Error while compiling statement: FAILED: SemanticException Unable to 
determine if hdfs://ns1/test is encrypted: 
java.lang.UnsupportedOperationException: This API:getEZForPath is specific to 
DFS. Can't run on other fs:o3fs://bucket.volume.ozone1 (state=42000,code=40000)
0: jdbc:hive2://umag-1.umag.root.xxx.site:218> cd Closing: 0: 
jdbc:hive2://umag-1.umag.root.xxx.site:2181,umag-2.umag.root.xxx.site:2181,umag-5.umag.root.xxx.site:2181/default;password=root;principal=hive/umag-5.umag.root.xxx.s...@root.hwx.site;retries=5;serviceDiscoveryMode=zooKeeper;user=root;zooKeeperNamespace=hiveserver2
{code}
 

So here we should use resolvePath to make sure the resolved path is really on 
HDFS. If the resolved path is not on HDFS (in the case above, it is an o3fs 
path), isPathEncrypted will return false.
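
The idea can be illustrated with a small standalone sketch. It does not use 
Hadoop or Hive classes: the class ResolvedSchemeCheck, its mount table, and the 
method names are hypothetical stand-ins, and resolve() only imitates what a 
call such as Hadoop's FileSystem.resolvePath would do for a mounted path. The 
mount point is the one from the example above.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a ViewFileSystem-style mount table; not Hive or Hadoop code. */
class ResolvedSchemeCheck {

    // Mount points: source prefix -> target prefix.
    private final Map<String, String> mounts = new LinkedHashMap<>();

    void mount(String src, String target) {
        mounts.put(src, target);
    }

    /** Rewrites the path through the first matching mount point; other paths are returned unchanged. */
    URI resolve(URI path) {
        String s = path.toString();
        for (Map.Entry<String, String> e : mounts.entrySet()) {
            String src = e.getKey();
            if (s.equals(src) || s.startsWith(src + "/")) {
                return URI.create(e.getValue() + s.substring(src.length()));
            }
        }
        return path;
    }

    /** Scheme-only check: trusts the scheme of the path as the user wrote it. */
    static boolean looksLikeHdfs(URI path) {
        return "hdfs".equalsIgnoreCase(path.getScheme());
    }

    /** Resolve first, then check the scheme of the resolved path. */
    boolean isResolvedPathOnHdfs(URI path) {
        return looksLikeHdfs(resolve(path));
    }

    public static void main(String[] args) {
        ResolvedSchemeCheck check = new ResolvedSchemeCheck();
        check.mount("hdfs://ns1/test", "o3fs://b.v.ozone1/test");

        URI path = URI.create("hdfs://ns1/test");
        System.out.println("scheme-only check:   " + looksLikeHdfs(path));
        System.out.println("resolved-path check: " + check.isResolvedPathOnHdfs(path));
    }
}
```

With the scheme-only check the mounted path is treated as HDFS; with the 
resolved-path check it is not, so an encryption-zone lookup would be skipped 
for it.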

After this fix, the query passes:

 
{code:java}
0: jdbc:hive2://umag-1.umag.root.xxx.site:218> select * from test30;
INFO  : Compiling 
command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb): 
select * from test30
INFO  : No Stats for default@test30, Columns: item, user_id, state, order_id
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:test30.order_id, type:bigint, 
comment:null), FieldSchema(name:test30.user_id, type:string, comment:null), 
FieldSchema(name:test30.item, type:string, comment:null), 
FieldSchema(name:test30.state, type:string, comment:null)], properties:null)
INFO  : Completed compiling 
command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb); Time 
taken: 4.47 seconds
INFO  : Executing 
command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb): 
select * from test30
INFO  : Completed executing 
command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb); Time 
taken: 0.09 seconds
INFO  : OK
+------------------+-----------------+--------------+---------------+
| test30.order_id  | test30.user_id  | test30.item  | test30.state  |
+------------------+-----------------+--------------+---------------+
| 1234             | u1              | iphone7      | CA            |
| 2345             | u1              | ipad         | CA            |
| 3456             | u2              | desktop      | NY            |
 
 
+------------------+-----------------+--------------+---------------+
11 rows selected (6.975 seconds)
{code}
 

  was:
Currently isPathEncrypted will make sure path is from hdfs by check the path 
scheme is "hdfs"

In the case if mounted ViewFileSystem based files systems like 
ViewFSOverloadScheme or ViewHDFS (HDFS-15289) may need o check resolved path is 
really hdfs.

In ViewHDFS case, we can mount hdfs://ns1/test ---> o3fs://b.v.ozone1/test

When user calling queries with the path hdfs://ns1/test, isPathEncrypted will 
think the path is from hdfs only as its checking path scheme.

 
{code:java}
0: jdbc:hive2://umag-1.umag.root.xxx.site:218> select * from test30;
Error: Error while compiling statement: FAILED: SemanticException Unable to 
determine if hdfs://ns1/test is encrypted: 
java.lang.UnsupportedOperationException: This API:getEZForPath is specific to 
DFS. Can't run on other fs:o3fs://bucket.volume.ozone1 (state=42000,code=40000)
0: jdbc:hive2://umag-1.umag.root.xxx.site:218> cd Closing: 0: 
jdbc:hive2://umag-1.umag.root.xxx.site:2181,umag-2.umag.root.xxx.site:2181,umag-5.umag.root.xxx.site:2181/default;password=root;principal=hive/umag-5.umag.root.xxx.s...@root.hwx.site;retries=5;serviceDiscoveryMode=zooKeeper;user=root;zooKeeperNamespace=hiveserver2
{code}
 

So, here we should use resolvePath to make sure the resolved path really in 
hdfs. If the resolved path is not from hdfs (in above case, it o3fs path), then 
it will return false.

After fixing this, the query is passing.:

 
{code:java}
0: jdbc:hive2://umag-1.umag.root.xxx.site:218> select * from test30;
INFO  : Compiling 
command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb): 
select * from test30
INFO  : No Stats for default@test30, Columns: item, user_id, state, order_id
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:test30.order_id, type:bigint, 
comment:null), FieldSchema(name:test30.user_id, type:string, comment:null), 
FieldSchema(name:test30.item, type:string, comment:null), 
FieldSchema(name:test30.state, type:string, comment:null)], properties:null)
INFO  : Completed compiling 
command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb); Time 
taken: 4.47 seconds
INFO  : Executing 
command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb): 
select * from test30
INFO  : Completed executing 
command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb); Time 
taken: 0.09 seconds
INFO  : OK
+------------------+-----------------+--------------+---------------+
| test30.order_id  | test30.user_id  | test30.item  | test30.state  |
+------------------+-----------------+--------------+---------------+
| 1234             | u1              | iphone7      | CA            |
| 2345             | u1              | ipad         | CA            |
| 3456             | u2              | desktop      | NY            |
 
 
+------------------+-----------------+--------------+---------------+
11 rows selected (6.975 seconds)
{code}
 


> isPathEncrypted should make sure resolved path also from HDFS
> -------------------------------------------------------------
>
>                 Key: HIVE-24342
>                 URL: https://issues.apache.org/jira/browse/HIVE-24342
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Shims
>    Affects Versions: 3.1.2
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
