: 12/08/2015 10:51
Subject: Re: possible issues with listing objects in the HadoopFSrelation
Hi Gil,
Sorry for the late reply and thanks for raising this question. The file
listing logic in HadoopFsRelation is intentionally made different from
Hadoop FileInputFormat. Here are the reasons:
1. Efficiency: when computing RDD partitions,
FileInputFormat.listStatus() is called on the dri
Just some thoughts; I hope I didn't miss something obvious.
HadoopFsRelation calls the FileSystem class directly to list files in the
path.
It looks like it implements basically the same logic as the
FileInputFormat.listStatus method (located in
hadoop-mapreduce-client-core).
The point is that