Vikram Ahuja created HADOOP-19460:
-------------------------------------

             Summary: High number of Threads Launched when Calling 
fs.getFileStatus() via proxyUser after Kerberos authentication.
                 Key: HADOOP-19460
                 URL: https://issues.apache.org/jira/browse/HADOOP-19460
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 3.3.6
            Reporter: Vikram Ahuja


We have observed an issue where a very large number of threads is launched 
when performing concurrent {{fs.getFileStatus(path)}} operations as a proxy user.

Although we first observed this issue in our Hive services, we were able to 
reproduce it without Hive by writing a standalone sample program that 
first logs in with a principal and keytab, then creates a proxy user and 
fires concurrent {{fs.getFileStatus(path)}} calls for a few minutes. As the 
concurrency increases, the process tries to create more threads than the maximum 
available (the ulimit) and eventually slows down.
{code:java}
UserGroupInformation proxyUserUGI = UserGroupInformation.createProxyUser(
"hive", UserGroupInformation.getLoginUser());{code}
In this particular case, when launching 30 concurrent threads calling 
{{fs.getFileStatus(path)}}, the maximum number of threads launched by the PID is 6066.
 
{code}
Every 1.0s: ps -eo nlwp,pid,args --sort -nlwp | head    Wed Feb 19 06:12:47 2025

NLWP     PID COMMAND
6066  700718 /usr/lib/jvm/java-17-openjdk/bin/java -cp ./test.jar:/usr/hadoop/*:/usr/hadoop/lib/*:/usr/hadoop-hdfs/* org.apache.hadoop.hive.common.HDFSFileStatusExample hdfs://namenode:8020 principal keytab_location 30
{code}
 
But the same behaviour is not observed when the same calls are made using the 
current user's UGI instead of the proxy user.
{code:java}
UserGroupInformation currentUserUgi = 
UserGroupInformation.getCurrentUser();{code}
 

In this case, when launching 30 concurrent threads calling 
{{fs.getFileStatus(path)}}, the maximum number of threads launched by the PID is 56, 
and with 500 concurrent threads it is 524.
{code}
Every 1.0s: ps -eo nlwp,pid,args --sort -nlwp | head    Tue Feb 18 06:23:18 2025

NLWP     PID COMMAND
  56  748244 /usr/lib/jvm/java-17-openjdk/bin/java -cp ./test.jar:/usr/hadoop/*:/usr/hadoop/lib/*:/usr/hadoop-hdfs/* org.apache.hadoop.hive.common.HDFSFileStatus hdfs://namenode:8020 principal keytab_location 30

Every 1.0s: ps -eo nlwp,pid,args --sort -nlwp | head    Wed Feb 19 06:19:03 2025

NLWP     PID COMMAND
 524  750984 /usr/lib/jvm/java-17-openjdk/bin/java -cp ./test.jar:/usr/hadoop/*:/usr/hadoop/lib/*:/usr/hadoop-hdfs/* org.apache.hadoop.hive.common.HDFSFileStatus hdfs://namenode:8020 principal keytab_location 500
{code}
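
The counts above were read from outside the process with {{ps}} (the NLWP column). As a cross-check from inside the JVM, the peak number of live Java threads could be sampled with the standard {{ThreadMXBean}}. This is only a sketch: {{ThreadCountProbe}} and {{peakDuring}} are hypothetical names and are not part of the attached programs. {{ThreadMXBean}} counts only Java threads, so its value is expected to be somewhat lower than NLWP, which also includes JVM-internal native threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper for illustration; not part of the attached programs. */
final class ThreadCountProbe {

    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    private ThreadCountProbe() {
    }

    /**
     * Resets the JVM's peak-thread counter, runs the workload on the calling
     * thread, and returns the highest number of live Java threads seen since
     * the reset.
     */
    static int peakDuring(Runnable workload) {
        THREADS.resetPeakThreadCount();  // peak := current live thread count
        workload.run();
        return THREADS.getPeakThreadCount();
    }
}
```

Comparing the value returned for the proxy-user run with the one for the current-user run should show the same kind of gap as the {{ps}} output, assuming the extra threads are Java threads.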
 

I am attaching both sample programs: in one the calls are made as the proxy 
user (the issue occurs here), and in the other the calls are made as the 
current user (works fine).
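
The attachments are not reproduced in this message. As a rough idea of the load-generation part only, a fixed number of worker threads can invoke the same call in a loop for a set duration. The sketch below uses only the JDK; {{ConcurrentCallDriver}} and {{run}} are hypothetical names, and the {{Runnable}} is a placeholder for the {{fs.getFileStatus(path)}} call (in a real run it would presumably be executed as the proxy user or the current user, for example via {{UserGroupInformation.doAs}}; that wiring is an assumption and is not shown here).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical load driver for illustration; not the attached program. */
final class ConcurrentCallDriver {

    private ConcurrentCallDriver() {
    }

    /**
     * Runs {@code call} in a loop on {@code threads} worker threads until
     * {@code durationMillis} has elapsed, then returns the total number of
     * completed invocations. If {@code call} throws, that worker stops early.
     */
    static long run(int threads, long durationMillis, Runnable call) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong completed = new AtomicLong();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);

        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                // Overflow-safe comparison of nanoTime values.
                while (System.nanoTime() - deadline < 0) {
                    call.run();  // placeholder for the fs.getFileStatus(path) call
                    completed.incrementAndGet();
                }
            });
        }

        pool.shutdown();  // accept no new tasks; let the workers finish
        try {
            // Allow a grace period beyond the deadline for in-flight calls.
            if (!pool.awaitTermination(durationMillis + 60_000, TimeUnit.MILLISECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

With this driver the pool itself contributes exactly {{threads}} worker threads, so any thread count far above that number would have to come from the code being called.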

The command line args given for the sample program are:

arg[0] = namenode_host_name:port

arg[1] = principal

arg[2] = keytab_location

arg[3] = Number of threads
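
For reference, the four positional arguments listed above could be validated along these lines. This is a minimal sketch: {{ReproArgs}} and its field names are hypothetical and are not taken from the attached programs.

```java
/** Hypothetical holder for the four positional arguments; for illustration only. */
final class ReproArgs {

    final String nameNode;   // arg[0], e.g. hdfs://namenode:8020 as in the runs above
    final String principal;  // arg[1]
    final String keytab;     // arg[2]
    final int threads;       // arg[3]

    private ReproArgs(String nameNode, String principal, String keytab, int threads) {
        this.nameNode = nameNode;
        this.principal = principal;
        this.keytab = keytab;
        this.threads = threads;
    }

    /** Parses the arguments in the order listed above, rejecting malformed input. */
    static ReproArgs parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "usage: <namenode_host_name:port> <principal> <keytab_location> <number_of_threads>");
        }
        int threads;
        try {
            threads = Integer.parseInt(args[3]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Number of threads must be an integer: " + args[3], e);
        }
        if (threads < 1) {
            throw new IllegalArgumentException("Number of threads must be at least 1: " + threads);
        }
        return new ReproArgs(args[0], args[1], args[2], threads);
    }
}
```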



