[ https://issues.apache.org/jira/browse/ARROW-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rok Mihevc updated ARROW-4802:
------------------------------
    External issue URL: https://github.com/apache/arrow/issues/21320

> [Python] Hadoop classpath discovery broken HADOOP_HOME is a symlink
> -------------------------------------------------------------------
>
>                 Key: ARROW-4802
>                 URL: https://issues.apache.org/jira/browse/ARROW-4802
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Micah Kornfield
>            Assignee: Tiger068
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> From [https://github.com/apache/arrow/issues/3748]:
> CLASSPATH discovery was recently changed in
> [{{d911850}}|https://github.com/apache/arrow/commit/d91185000945cec96abad41a230d05d3cdefff93]
> to resolve ARROW-2113 and ARROW-3768.
> Specifically, the logic used to find all jars under HADOOP_HOME uses the find
> command directly
> [arrow/python/pyarrow/hdfs.py|https://github.com/apache/arrow/blob/d91185000945cec96abad41a230d05d3cdefff93/python/pyarrow/hdfs.py#L144]
> Line 144 in
> [d911850|https://github.com/apache/arrow/commit/d91185000945cec96abad41a230d05d3cdefff93]
> | |find_args = ('find', os.environ['HADOOP_HOME'], '-name', '*.jar')|
> This will not work when HADOOP_HOME is a symlink, in which case '-L' needs to
> be passed to the find command.
> CLASSPATH can still be set explicitly, but this is a change in behavior as
> HADOOP_HOME symlinks worked without issue before.
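
For illustration, here is a minimal sketch of the change the report asks for: placing '-L' ahead of the search path so that find follows a symlinked HADOOP_HOME. The helper names (find_args_for, jars_via_find, jars_via_walk) are hypothetical and are not taken from pyarrow, and the os.walk variant is only an assumed pure-Python alternative, not something the issue proposes.

```python
import os
import subprocess


def find_args_for(hadoop_home):
    # '-L' tells find to follow symbolic links; it has to appear
    # before the starting path. Without it, find examines only the
    # link itself when hadoop_home is a symlink.
    return ('find', '-L', hadoop_home, '-name', '*.jar')


def jars_via_find(hadoop_home):
    # Hypothetical helper: run find and return the jar paths it prints.
    out = subprocess.check_output(find_args_for(hadoop_home))
    return sorted(out.decode('utf-8').splitlines())


def jars_via_walk(hadoop_home):
    # Assumed alternative that avoids the external find command.
    # os.walk lists the top-level path even when it is a symlink;
    # followlinks=True also descends into symlinked subdirectories.
    jars = []
    for root, _dirs, files in os.walk(hadoop_home, followlinks=True):
        jars.extend(os.path.join(root, name)
                    for name in files if name.endswith('.jar'))
    return sorted(jars)
```

Either helper could be called as jars_via_find(os.environ['HADOOP_HOME']). As the report notes, setting CLASSPATH explicitly remains a workaround on affected versions.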