>> In Hive, FileUtils.checkFileAccessWithImpersonation can be fixed to
create the UGI once to reduce the impact (suspecting this will have a 50%
impact).

I looked closely at the implementation of
FileUtils.checkFileAccessWithImpersonation. It doesn't make two
connections, so the 50% estimate may not apply here.
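
For the "local cache with expiry" idea raised in the quoted message below, here is a minimal sketch in plain Java. It is not Hive or Hadoop code: ExpiringCacheDemo, ExpiringCache, the TTL value, and the loader are all hypothetical names chosen for illustration. The thread does not say what would be cached, so the sketch assumes one value per user name, and a String stands in for whatever per-user object (for example a UGI) would be reused.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch, not Hive code: a small cache whose entries expire after a TTL.
class ExpiringCacheDemo {

    static final class ExpiringCache<K, V> {
        // A cached value together with the time at which it stops being valid.
        private static final class Entry<V> {
            final V value;
            final long expiresAt;
            Entry(V value, long expiresAt) {
                this.value = value;
                this.expiresAt = expiresAt;
            }
        }

        private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
        private final long ttlMillis;
        private final LongSupplier clock; // injectable so expiry can be exercised without sleeping

        ExpiringCache(long ttlMillis, LongSupplier clock) {
            this.ttlMillis = ttlMillis;
            this.clock = clock;
        }

        // Returns the cached value, calling the loader only if the entry is missing or expired.
        V get(K key, Function<K, V> loader) {
            final long now = clock.getAsLong();
            Entry<V> entry = entries.compute(key, (k, old) ->
                (old == null || old.expiresAt <= now)
                    ? new Entry<>(loader.apply(k), now + ttlMillis)
                    : old);
            return entry.value;
        }
    }

    public static void main(String[] args) {
        long[] now = {0L};   // fake clock, in milliseconds
        int[] loads = {0};   // how many times the loader ran
        ExpiringCache<String, String> cache = new ExpiringCache<>(1000L, () -> now[0]);
        Function<String, String> loader = user -> { loads[0]++; return "ugi-for-" + user; };

        cache.get("alice", loader);
        now[0] = 500L;
        cache.get("alice", loader);   // still within the TTL, served from the cache
        System.out.println(loads[0]); // prints 1

        now[0] = 1000L;
        cache.get("alice", loader);   // TTL reached, the loader runs again
        System.out.println(loads[0]); // prints 2
    }
}
```

Whether a cache like this is safe for real UGIs depends on how long a cached proxy identity may be reused, which the thread leaves open.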

On Thu, Sep 1, 2022 at 4:48 AM Rajesh Balamohan <rbalamo...@apache.org>
wrote:

>
> W.r.t. connection reuse issues, LLAP had a similar issue (not in HMS):
> https://issues.apache.org/jira/browse/HIVE-16020. It was making a
> connection in every task, and the UGI had to be persisted at the QueryInfo
> level to reduce the impact.
>
> In Hive, FileUtils.checkFileAccessWithImpersonation can be fixed to
> create the UGI once to reduce the impact (suspecting this will have a 50%
> impact).
>
>
> https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L418
>
> https://github.com/apache/hive/blob/d06957f254e026e719f30027d161264be43386b0/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L461
>
> May have to explore whether a local cache with expiry in FileUtils can
> help reduce the impact further.
>
> ~Rajesh.B
>
>
> On Thu, Sep 1, 2022 at 1:24 AM Owen O'Malley <owen.omal...@gmail.com>
> wrote:
>
>> We're using HMS with Storage-Based Authorization and have been having
>> trouble with the HMS running out of threads. Looking at the jstack & code,
>> it appears that the problem is that RPC's ConnectionId uses UGI's
>> equals/hashCode, which use the Subject's Object equals/hashCode. Proxy
>> user UGIs always create a new Subject and thus are always unique.
>>
>> This leads to the HMS creating too many threads. I've created a Jira in
>> Hadoop: https://issues.apache.org/jira/browse/HADOOP-18434
>>
>> Thanks,
>>    Owen
>>
>
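
To illustrate the effect described in the original report, here is a small self-contained model in plain Java. It does not use Hadoop classes: FakeSubject, FakeUgi, and the map standing in for the RPC connection table are all hypothetical stand-ins, written under the assumption that equality is based on the identity of the subject object, as the report describes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not Hadoop code: shows how identity-based keys defeat connection reuse.
class ConnectionKeyDemo {

    // Stand-in for a Subject: inherits Object's identity-based equals/hashCode.
    static final class FakeSubject {
        final String user;
        FakeSubject(String user) { this.user = user; }
    }

    // Stand-in for a UGI whose equality is the identity of its subject.
    static final class FakeUgi {
        final FakeSubject subject;
        FakeUgi(FakeSubject subject) { this.subject = subject; }

        // Mirrors the behaviour in the report: every proxy user gets a brand-new subject.
        static FakeUgi createProxyUser(String user) {
            return new FakeUgi(new FakeSubject(user));
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof FakeUgi && ((FakeUgi) o).subject == subject;
        }

        @Override
        public int hashCode() {
            return System.identityHashCode(subject);
        }
    }

    // Builds a new proxy UGI per request; each one becomes a distinct connection key.
    static int connectionsWithFreshUgi(int requests) {
        Map<FakeUgi, String> connections = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            connections.computeIfAbsent(FakeUgi.createProxyUser("alice"), k -> "connection");
        }
        return connections.size();
    }

    // Reuses one proxy UGI across requests; all requests share a single connection key.
    static int connectionsWithReusedUgi(int requests) {
        Map<FakeUgi, String> connections = new HashMap<>();
        FakeUgi ugi = FakeUgi.createProxyUser("alice");
        for (int i = 0; i < requests; i++) {
            connections.computeIfAbsent(ugi, k -> "connection");
        }
        return connections.size();
    }

    public static void main(String[] args) {
        System.out.println(connectionsWithFreshUgi(5));  // prints 5
        System.out.println(connectionsWithReusedUgi(5)); // prints 1
    }
}
```

In this model the same user name yields five table entries when the UGI is rebuilt per request and one when it is reused, which is why persisting or caching the UGI reduces the number of connections.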
