Martin Durant created ARROW-1320:
------------------------------------

             Summary: hdfs block locations
                 Key: ARROW-1320
                 URL: https://issues.apache.org/jira/browse/ARROW-1320
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Martin Durant

Provide a function that returns the set of machines on which the data blocks of a given HDFS file are stored. This would be most useful for scheduling systems (e.g., dask) that can move the computation to the machine that holds the data, and so avoid network data traffic.
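
To illustrate how a scheduler might consume such information, here is a minimal sketch in Python. It does not use any existing Arrow API: the record layout (a dict with "offset", "length" and "hosts" keys) and the function name assign_blocks are assumptions made up for this example. It only shows the idea of preferring a worker that already holds a replica of each block.

```python
def assign_blocks(blocks, workers):
    """Pick a worker for each block, preferring hosts that store a replica.

    `blocks` is a hypothetical list of dicts with "offset", "length" and
    "hosts" keys; this layout is an assumption, not an Arrow data structure.
    `workers` is the list of host names where computation can run.
    Returns a dict mapping block offset -> chosen worker.
    """
    load = {w: 0 for w in workers}  # number of blocks assigned so far
    assignment = {}
    for block in sorted(blocks, key=lambda b: b["offset"]):
        # Workers that already hold this block locally.
        local = [h for h in block["hosts"] if h in load]
        # Fall back to any worker if no replica lives on a worker host.
        candidates = local or list(workers)
        # Least-loaded candidate; ties are broken by host name.
        chosen = min(candidates, key=lambda h: (load[h], h))
        assignment[block["offset"]] = chosen
        load[chosen] += 1
    return assignment


# Made-up example data: three blocks, the last one stored on a non-worker host.
example_blocks = [
    {"offset": 0, "length": 134217728, "hosts": ["node1", "node2"]},
    {"offset": 134217728, "length": 134217728, "hosts": ["node2", "node3"]},
    {"offset": 268435456, "length": 1000, "hosts": ["node4"]},
]
example_workers = ["node1", "node2", "node3"]
example_assignment = assign_blocks(example_blocks, example_workers)
```

In this sketch the first two blocks are read where they are stored, and only the third block, which has no replica on any worker, would need to travel over the network.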