Fred Tzeng created ARROW-6044:
---------------------------------
Summary: Pyarrow HDFS client gets hung after a while
Key: ARROW-6044
URL: https://issues.apache.org/jira/browse/ARROW-6044
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.13.0
Environment: hadoop-3.0.3
driver='libhdfs'
python 3.6
Centos7
Reporter: Fred Tzeng
I'm using the pyarrow HDFS client in a long running (forever) app that makes
connections to HDFS as external requests come in and destroys the connection as
soon as the request is handled. This happens a large amount of times on
separate threads and everything works great.
The problem is, after the app idles for a while (perhaps hours) and no HDFS
connections are made during this time, when the next connection is attempted,
the API hdfs.connect(...) just hangs. No exceptions are thrown.
Code snippet on what i'm doing to instantiate each connection:
...
hdfs = pyarrow.hdfs.connect(self.hdfs_authority, self.hdfs_port,
user=self.hdfs_user)
try:
//Do something
finally:
hdfs.close
Any help on what might be causing these hangs is appreciated
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)