Re: question about HadoopFsRelation

2015-10-25 Thread Koert Kuipers
thanks i will read up on that

On Sat, Oct 24, 2015 at 12:53 PM, Ted Yu wrote:
> The code below was introduced by SPARK-7673 / PR #6225
>
> See item #1 in the description of the PR.
>
> Cheers
>
> On Sat, Oct 24, 2015 at 12:59 AM, Koert Kuipers wrote:
>
>> the code that seems to flatMap director

Re: question about HadoopFsRelation

2015-10-24 Thread Ted Yu
The code below was introduced by SPARK-7673 / PR #6225

See item #1 in the description of the PR.

Cheers

On Sat, Oct 24, 2015 at 12:59 AM, Koert Kuipers wrote:
> the code that seems to flatMap directories to all the files inside is in
> the private HadoopFsRelation.buildScan:
>
> // First a

Re: question about HadoopFsRelation

2015-10-24 Thread Koert Kuipers
the code that seems to flatMap directories to all the files inside is in the private HadoopFsRelation.buildScan:

// First assumes `input` is a directory path, and tries to get all files contained in it.
fileStatusCache.leafDirToChildrenFiles.getOrElse(
  path,
  // Otherwise,
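The quoted snippet is cut off, so as a rough illustration of the pattern it describes (look each input path up as a leaf directory and substitute the files inside it), here is a minimal sketch. It is not the Spark implementation: `ExpandInputPaths`, `expand`, and the plain-string map are hypothetical stand-ins for the real cache of Hadoop file statuses, and the fallback of keeping the path unchanged is an assumption, since the original fallback branch is not visible in the preview.

```scala
// Hypothetical sketch, not Spark code: strings stand in for Hadoop file statuses.
object ExpandInputPaths {
  def expand(
      inputs: Seq[String],
      leafDirToChildrenFiles: Map[String, Seq[String]]): Seq[String] =
    inputs.flatMap { path =>
      // First assume `path` is a leaf directory and take the files listed under it.
      // Assumed fallback: if it is not a known directory, keep the path as a single file.
      leafDirToChildrenFiles.getOrElse(path, Seq(path))
    }
}
```

With a map containing `"/data/a" -> Seq("/data/a/part-0", "/data/a/part-1")`, calling `ExpandInputPaths.expand(Seq("/data/a", "/data/file.parquet"), ...)` replaces the directory entry with its two files and leaves the file path untouched.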

question about HadoopFsRelation

2015-10-23 Thread Koert Kuipers
i noticed in the comments for HadoopFsRelation.buildScan it says:

 * @param inputFiles For a non-partitioned relation, it contains paths of all data files in the
 *        relation. For a partitioned relation, it contains paths of all data files in a single
 *        selected partition.

do i