>> In hive, FileUtils.checkFileAccessWithImpersonation can be fixed to create the UGI once to reduce the impact (suspecting this will have 50% impact).
I looked closely at the implementation of "FileUtils.checkFileAccessWithImpersonation". It doesn't make 2 connections, so the suspected 50% impact may not be relevant here.

On Thu, Sep 1, 2022 at 4:48 AM Rajesh Balamohan <rbalamo...@apache.org> wrote:

> W.r.t. connection reuse issues, LLAP had a similar issue (not in HMS):
> https://issues.apache.org/jira/browse/HIVE-16020. It was making a
> connection in every task and UGI had to be persisted at the QueryInfo level
> to reduce the impact.
>
> In hive, FileUtils.checkFileAccessWithImpersonation can be fixed to
> create the UGI once to reduce the impact (suspecting this will have 50%
> impact).
>
> https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L418
>
> https://github.com/apache/hive/blob/d06957f254e026e719f30027d161264be43386b0/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L461
>
> May have to explore whether a local cache with expiry in FileUtils can
> help reduce the impact further.
>
> ~Rajesh.B
>
>
> On Thu, Sep 1, 2022 at 1:24 AM Owen O'Malley <owen.omal...@gmail.com>
> wrote:
>
>> We're using HMS with Storage-Based Authorization and have been having
>> trouble with the HMS running out of threads. Looking at the jstack & code,
>> it appears that the problem is that RPC's ConnectionId is using UGI's
>> equal/hash, which uses the Subject's Object equals/hash. Proxy user UGIs
>> always create a new Subject and thus are always unique.
>>
>> This leads to the HMS creating too many threads. I've created a jira in
>> Hadoop. https://issues.apache.org/jira/browse/HADOOP-18434
>>
>> Thanks,
>> Owen
>>
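
For anyone following the thread, here is a minimal, self-contained sketch of the effect Owen describes: a connection key that relies on identity-based equality of the caller object never matches an earlier key, so every call ends up with its own connection. This is a toy model, not Hadoop code. `ModelUgi`, `ModelConnectionId`, `UgiConnectionModel`, the user name "alice", and the address "nn:8020" are all hypothetical names made up for illustration, and the `HashMap` only stands in for whatever the RPC client uses to track connections.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a proxy-user UGI. equals/hashCode are deliberately
// NOT overridden, so two instances for the same user are never equal
// (identity semantics, as described in the thread).
final class ModelUgi {
    final String user;

    ModelUgi(String user) {
        this.user = user;
    }
}

// Hypothetical stand-in for an RPC connection key that includes the UGI.
final class ModelConnectionId {
    private final String address;
    private final ModelUgi ugi;

    ModelConnectionId(String address, ModelUgi ugi) {
        this.address = address;
        this.ugi = ugi;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ModelConnectionId)) {
            return false;
        }
        ModelConnectionId other = (ModelConnectionId) o;
        // Delegates to ModelUgi.equals, which is Object identity.
        return address.equals(other.address) && ugi.equals(other.ugi);
    }

    @Override
    public int hashCode() {
        return Objects.hash(address, ugi);
    }
}

final class UgiConnectionModel {
    // Returns how many distinct "connections" a keyed pool ends up holding
    // after the given number of calls for the same user and address.
    static int connectionsOpened(int calls, boolean reuseUgi) {
        Map<ModelConnectionId, Integer> pool = new HashMap<>();
        ModelUgi shared = new ModelUgi("alice");
        for (int i = 0; i < calls; i++) {
            // Either reuse one UGI object or create a fresh one per call.
            ModelUgi ugi = reuseUgi ? shared : new ModelUgi("alice");
            pool.merge(new ModelConnectionId("nn:8020", ugi), 1, Integer::sum);
        }
        return pool.size();
    }

    public static void main(String[] args) {
        System.out.println(connectionsOpened(100, false)); // prints 100
        System.out.println(connectionsOpened(100, true));  // prints 1
    }
}
```

In this model, holding on to a single UGI object (as HIVE-16020 did at the QueryInfo level, or as a local cache with expiry might) collapses the pool to one entry. Whether the same gain applies to FileUtils depends on how many connections the method actually opens, which is the open question in this thread.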