Re: Is JavaSparkContext.wholeTextFiles distributed?

Ted Yu Tue, 26 Apr 2016 07:53:36 -0700

Please take a look at:
core/src/main/scala/org/apache/spark/SparkContext.scala


   * Do `val rdd = sparkContext.wholeTextFiles("hdfs://a-hdfs-path")`,
   *
   * <p> then `rdd` contains
   * {{{
   *   (a-hdfs-path/part-00000, its content)
   *   (a-hdfs-path/part-00001, its content)
   *   ...
   *   (a-hdfs-path/part-nnnnn, its content)
   * }}}
...
   * @param minPartitions A suggestion value of the minimal splitting number for input data.

  def wholeTextFiles(
      path: String,
      minPartitions: Int = defaultMinPartitions): RDD[(String, String)] = withScope {
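
To make this concrete, here is a minimal sketch (my own, not taken from the Spark docs) of how one might check it. wholeTextFiles returns an RDD of (path, content) pairs with one record per file, so the files are spread over partitions and are read by the tasks that process those partitions once an action runs. The object name, the app name and the minPartitions value of 8 are placeholders made up for illustration, and the input path is taken from the first command-line argument. For S3 paths this also assumes the matching Hadoop S3 filesystem connector is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; the name is illustrative only.
object WholeTextFilesSketch {
  def main(args: Array[String]): Unit = {
    if (args.isEmpty) {
      System.err.println("usage: WholeTextFilesSketch <input-path>")
      sys.exit(1)
    }
    val inputPath = args(0)

    // Falls back to local mode only when no master was supplied (e.g. by spark-submit).
    val conf = new SparkConf()
      .setAppName("whole-text-files-sketch")
      .setIfMissing("spark.master", "local[*]")
    val sc = new SparkContext(conf)

    try {
      // One (path, content) record per file. minPartitions is a suggestion,
      // not a guarantee; 8 is an arbitrary value chosen for this sketch.
      val rdd = sc.wholeTextFiles(inputPath, minPartitions = 8)

      // Number of partitions the input was actually split into.
      println(s"partitions: ${rdd.partitions.length}")

      // The transformation is lazy; take(5) is the action that triggers
      // reading the files in the tasks that own the partitions.
      val sizes = rdd.mapValues(_.length)
      sizes.take(5).foreach { case (path, chars) =>
        println(s"$path -> $chars chars")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Since minPartitions is only a suggestion, the printed partition count may differ from the value passed in. Each file is loaded whole as a single record, so this approach suits many small files better than a few very large ones.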

On Tue, Apr 26, 2016 at 7:43 AM, Vadim Vararu wrote:

> Hi guys,
>
> I'm trying to read many files from S3 using
> JavaSparkContext.wholeTextFiles(...). Is that executed in a distributed
> manner? Please give me a link to the place in documentation where it's
> specified.
>
> Thanks, Vadim.
