CC'ing common-dev as that list has more activity

On 7/7/25 20:39, Matt wrote:
Hi Hadoop team!

I'm not sure where to report this, and the Jira board does not allow public sign-up, so I figured I'd start here. I found a thread leak in the ABFS driver that causes OutOfMemoryErrors in Hive Metastore environments, specifically in this part of the code base:

https://github.com/apache/hadoop/blob/f099f08d922689dd2bd641bbbbd7c29c451463df/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingAnalyzer.java#L127

The issue appears to be that the timer tasks are cleaned up, but the timer threads themselves never are. Since nothing stops or collects these threads, they accumulate and will eventually lead to an OOM. I was able to reproduce this locally in 3.3.6 and 3.4.1, but I believe it would affect any version that relies on autothrottling for ABFS.
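
To illustrate the behavior described above, here is a minimal standalone sketch using only `java.util.Timer`. It is not the ABFS code: the class name `TimerThreadLeakDemo` and the thread name are hypothetical, chosen for the demo. It checks whether the timer's thread is still alive after the task alone is cancelled, and again after the timer itself is cancelled.

```java
import java.util.Timer;
import java.util.TimerTask;

public class TimerThreadLeakDemo {

    /** Finds a live thread by name, or returns null if there is none. */
    static Thread findThread(String name) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().equals(name)) {
                return t;
            }
        }
        return null;
    }

    /** Returns {aliveAfterTaskCancel, aliveAfterTimerCancel}. */
    static boolean[] run() throws InterruptedException {
        String name = "demo-throttling-timer"; // hypothetical thread name
        Timer timer = new Timer(name, true);   // daemon thread, started immediately
        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                // periodic work would go here
            }
        };
        timer.schedule(task, 0, 50);

        // Cancelling the task removes the work, not the thread that runs it.
        task.cancel();
        Thread.sleep(200);
        Thread worker = findThread(name);
        boolean aliveAfterTaskCancel = worker != null && worker.isAlive();

        // Cancelling the timer asks its thread to exit.
        timer.cancel();
        timer.purge();
        if (worker != null) {
            worker.join(2000);
        }
        boolean aliveAfterTimerCancel = worker != null && worker.isAlive();

        return new boolean[] {aliveAfterTaskCancel, aliveAfterTimerCancel};
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println("alive after task.cancel():  " + r[0]);
        System.out.println("alive after timer.cancel(): " + r[1]);
    }
}
```

In a long-running process that keeps creating such timers without ever cancelling them, each one leaves a thread behind, which matches the accumulation described above.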

I was also able to put together a quick fix and confirm a workaround. The long-term fix would be to add a method to AbfsClientThrottlingAnalyzer.java that calls `timer.cancel()` and `timer.purge()`. The short-term workaround is to disable autothrottling and rely on Azure to throttle the connections as needed, using the configuration below.

```
<property>
  <name>fs.azure.enable.autothrottling</name>
  <value>false</value>
</property>
```
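
For the long-term fix mentioned above, a rough sketch of the shape it could take is shown here. This is not the actual patch and not the real `AbfsClientThrottlingAnalyzer`. The class `ClosableAnalyzerSketch`, its constructor parameters, and its thread-name prefix are all hypothetical. It only assumes that the owner of a `java.util.Timer` exposes a `close()` that calls `timer.cancel()` and `timer.purge()`.

```java
import java.io.Closeable;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for an analyzer that owns a periodic Timer. */
public class ClosableAnalyzerSketch implements Closeable {
    private final String threadName;
    private final Timer timer;
    private final AtomicLong ticks = new AtomicLong();

    public ClosableAnalyzerSketch(String name, long periodMs) {
        this.threadName = "sketch-throttling-timer-" + name; // hypothetical prefix
        this.timer = new Timer(threadName, true);
        this.timer.schedule(new TimerTask() {
            @Override
            public void run() {
                ticks.incrementAndGet(); // placeholder for the periodic analysis
            }
        }, periodMs, periodMs);
    }

    public String getThreadName() {
        return threadName;
    }

    /** Stops the timer thread; safe to call more than once. */
    @Override
    public void close() {
        timer.cancel(); // discards scheduled tasks and lets the thread exit
        timer.purge();  // drops any cancelled tasks still referenced by the queue
    }

    /** Counts live threads with the given name. */
    static long liveThreadsNamed(String name) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().equals(name))
                .count();
    }

    public static void main(String[] args) throws Exception {
        String name;
        try (ClosableAnalyzerSketch analyzer = new ClosableAnalyzerSketch("demo", 50)) {
            name = analyzer.getThreadName();
            Thread.sleep(120);
            System.out.println("threads while open:  " + liveThreadsNamed(name));
        }
        Thread.sleep(300); // give the timer thread time to exit
        System.out.println("threads after close: " + liveThreadsNamed(name));
    }
}
```

Whatever owns the analyzer would also need to call `close()` when it is done with it. Where that call belongs in the ABFS client lifecycle is not something this sketch tries to answer.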

I'm happy to share my fix and test results, but I'm not quite sure who to share them with. Any direction is greatly appreciated!

Thank you,
Matt


---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
