Please take a look at:
core/src/main/scala/org/apache/spark/SparkContext.scala
* Do `val rdd = sparkContext.wholeTextFiles("hdfs://a-hdfs-path")`,
*
* <p> then `rdd` contains
* {{{
* (a-hdfs-path/part-00000, its content)
* (a-hdfs-path/part-00001, its content)
* ...
* (a-hdfs-path/part-nnnnn, its content)
* }}}
...
* @param minPartitions A suggestion value of the minimal splitting number
*   for input data.
def wholeTextFiles(
    path: String,
    minPartitions: Int = defaultMinPartitions): RDD[(String, String)] =
  withScope {
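
On the distributed part: the method returns an RDD, so as far as I understand it the file contents are read by tasks on the executors once an action runs, with each file read whole by a single task. Here is a minimal sketch of how you could check that yourself. The bucket path, the app name, the `local[*]` master and the value 8 are placeholders I made up, not anything from your setup.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WholeTextFilesSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local[*] is only for trying this out; on a cluster the
    // master would normally be supplied by spark-submit instead.
    val conf = new SparkConf()
      .setAppName("whole-text-files-sketch")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical path. minPartitions is only a suggestion, as the
      // Scaladoc quoted above says.
      val files = sc.wholeTextFiles("s3a://some-bucket/some-prefix", minPartitions = 8)

      // Each element is (file path, whole file content).
      println(s"partitions: ${files.getNumPartitions}")

      // mapValues is lazy; the files are only read when take() runs.
      val sizes = files.mapValues(_.length)
      sizes.take(5).foreach { case (path, n) => println(s"$path -> $n chars") }
    } finally {
      sc.stop()
    }
  }
}
```

If the partition count printed is greater than 1, the files are spread over several tasks. Since every file becomes a single record, this is a better fit for many small files than for a few very large ones.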
On Tue, Apr 26, 2016 at 7:43 AM, Vadim Vararu <[email protected]>
wrote:
> Hi guys,
>
> I'm trying to read many files from s3 using
> JavaSparkContext.wholeTextFiles(...). Is that executed in a distributed
> manner? Please give me a link to the place in documentation where it's
> specified.
>
> Thanks, Vadim.