Re: 2 input paths generate 3 partitions

2015-03-27 Thread Rares Vernica
DFS. > > Did you set the defaultParallelism to be 3 in your spark? > > Yong > > ------ > Subject: Re: 2 input paths generate 3 partitions > From: zzh...@hortonworks.com > To: rvern...@gmail.com > CC: user@spark.apache.org > Date: Fri, 27 Mar 2015 23:

RE: 2 input paths generate 3 partitions

2015-03-27 Thread java8964
The files sound too small to be 2 blocks in HDFS. Did you set defaultParallelism to 3 in your Spark configuration? Yong Subject: Re: 2 input paths generate 3 partitions From: zzh...@hortonworks.com To: rvern...@gmail.com CC: user@spark.apache.org Date: Fri, 27 Mar 2015 23:15:38 + Hi Rares

Re: 2 input paths generate 3 partitions

2015-03-27 Thread Zhan Zhang
Hi Rares, The number of partitions is controlled by the HDFS input format, and one file may have multiple partitions if it consists of multiple blocks. In your case, I think there is one file with 2 splits. Thanks. Zhan Zhang On Mar 27, 2015, at 3:12 PM, Rares Vernica <rvern...@gmail.com> wro
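
To make the "one file with 2 splits" idea concrete, here is a rough sketch of how the split count can come out as 3 for two small files. It is a simplified re-implementation, in plain Python, of the split-size arithmetic in Hadoop's old-API FileInputFormat as I understand it; it is not the actual Hadoop or Spark code. The function name split_sizes and the byte sizes 40 and 20 are made up for illustration (the thread does not give the real file sizes), and num_splits=2 assumes sc.textFile is called without an explicit minimum partition count.

```python
SPLIT_SLOP = 1.1  # the last split of a file may run up to 10% over the target size


def split_sizes(file_lengths, num_splits=2, block_size=128 * 1024 * 1024, min_size=1):
    """Return, per file, the list of split lengths in bytes.

    Simplified model: the target split size is derived from the total size
    of all input files divided by the requested number of splits, and is
    then applied to each file separately.
    """
    total = sum(file_lengths)
    goal_size = total // max(num_splits, 1)
    split_size = max(min_size, min(goal_size, block_size))

    result = []
    for length in file_lengths:
        splits = []
        remaining = length
        # Cut full-size splits while the remainder is clearly larger than the target.
        while remaining / split_size > SPLIT_SLOP:
            splits.append(split_size)
            remaining -= split_size
        # Whatever is left becomes the final split of this file.
        if remaining > 0:
            splits.append(remaining)
        result.append(splits)
    return result


# Hypothetical sizes: part-0 is about twice as large as part-1.
sizes = split_sizes([40, 20])
print(sizes)
print(sum(len(s) for s in sizes))
```

With these assumed sizes the target split size is 30 bytes, so the larger file is cut into two splits and the smaller one stays whole, which gives three partitions in total even though both files fit easily in a single HDFS block. Requesting a single split (num_splits=1) in the same sketch yields one split per file.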

2 input paths generate 3 partitions

2015-03-27 Thread Rares Vernica
Hello, I am using the Spark shell in Scala on localhost. I am using sc.textFile to read a directory. The directory looks like this (generated by another Spark script): part-0, part-1, _SUCCESS. part-0 has four short lines of text while part-1 has two short lines of text. Th