László Végh created HIVE-26966:
----------------------------------

             Summary: Hive is unable to delete Azure storage objects
                 Key: HIVE-26966
                 URL: https://issues.apache.org/jira/browse/HIVE-26966
             Project: Hive
          Issue Type: Improvement
            Reporter: László Végh


While Hive writes data to cloud storage using the expected RAZ-authenticated 
path (access through Managed Identity), HiveProtoEventsCleanerTask follows a 
different approach: it tries to delete the data as the directory owner, a user 
which may not be available in Ranger.

To solve this issue, either
 * investigate how authentication works for data writing and implement it for 
deletion as well (preferred solution),
 * or introduce a new configuration value holding the name of the user who 
should be used for deleting the data.
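
The second option could look roughly like the sketch below. It is only an 
illustration, not the actual Hive change: the class name CleanerUserResolver 
and the property name hive.hook.proto.events.cleaner.delete.user are 
hypothetical, and a plain Map stands in for the Hive configuration object. 
The idea is that a configured user, if present, takes precedence over the 
directory owner.

```java
import java.util.Map;

/**
 * Hypothetical helper (not part of Hive): decides which user name should be
 * used when expired proto-event directories are deleted.
 */
final class CleanerUserResolver {
    // Hypothetical property name; this is NOT an existing Hive configuration key.
    static final String DELETE_USER_KEY = "hive.hook.proto.events.cleaner.delete.user";

    private CleanerUserResolver() {
    }

    /**
     * Returns the configured delete user if one is set and non-blank,
     * otherwise falls back to the directory owner (the current behaviour
     * described in this ticket).
     */
    static String resolve(Map<String, String> conf, String directoryOwner) {
        String configured = conf.get(DELETE_USER_KEY);
        if (configured != null && !configured.trim().isEmpty()) {
            return configured.trim();
        }
        return directoryOwner;
    }
}
```

For example, with the hypothetical property set to hive, a directory owned by 
9ffea8fa-dec1-49ea-bb45-72bcb43951e8 would be deleted as hive; with the 
property unset, the directory owner is used as before.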

 

Related Hadoop logs:
{code:java}
2022-12-07 11:30:07,163 WARN 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping: [pool-310888-thread-7]: 
unable to return groups for user 9ffea8fa-dec1-49ea-bb45-72bcb43951e8
org.apache.hadoop.security.ShellBasedUnixGroupsMapping$PartialGroupNameException:
 The user name '9ffea8fa-dec1-49ea-bb45-72bcb43951e8' is not found. id: 
9ffea8fa-dec1-49ea-bb45-72bcb43951e8: no such user
id: 9ffea8fa-dec1-49ea-bb45-72bcb43951e8: no such user

2022-12-07 11:30:07,164 ERROR 
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore: [pool-310888-thread-7]: 
Failed to get primary group for 9ffea8fa-dec1-49ea-bb45-72bcb43951e8, using 
user name as primary group name

2022-12-07 11:30:07,231 INFO 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager: 
[TezSessionPool-expiration]: Created new tez session for queue: default with 
session id: df027903-43dd-46a8-b654-a25834f2b90d
{code}
Ranger logs:
{code:java}
2022-12-07 11:29:20,693 INFO 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
 Token cancellation requested for identifier: (ABFS delegation owner=hive, 
renewer=yarn, realUser=, issueDate=1670411627352, maxDate=1671016427352, 
sequenceNumber=24065, masterKeyId=95)

2022-12-07 11:30:07,316 WARN 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping: unable to return groups 
for user 9ffea8fa-dec1-49ea-bb45-72bcb43951e8
PartialGroupNameException The user name '9ffea8fa-dec1-49ea-bb45-72bcb43951e8' 
is not found. id: 9ffea8fa-dec1-49ea-bb45-72bcb43951e8: no such user
id: 9ffea8fa-dec1-49ea-bb45-72bcb43951e8: no such user

2022-12-07 11:30:07,317 ERROR org.apache.ranger.raz.rest.AuthzREST: 
AuthzREST.authorizeAccess()
org.apache.ranger.raz.intg.RangerRazException: not authorized to perform 
delete-recursive on path 
abfs://d...@s05p1appcdp001.dfs.core.windows.net/warehouse/tablespace/external/hive/sys.db/query_data/date=2021-08-20
at 
org.apache.ranger.raz.processor.adls.AdlsGen2RazProcessor.generateDSASToken(AdlsGen2RazProcessor.java:216)
{code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
