S3n uses the Hadoop API underneath, so I would guess it partitions the input according to your Hadoop configuration (128 MB per partition by default).
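If it helps, here is a minimal Scala sketch for checking the resulting partition count yourself. The bucket/folder path is the one from your question, and the minPartitions value is just an illustrative hint, not a recommendation:

import org.apache.spark.{SparkConf, SparkContext}

object S3nPartitionCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("s3n-partition-count"))

    // Each file in the folder becomes at least one partition; files larger
    // than the Hadoop split size are divided into several partitions.
    val rdd = sc.textFile("s3n://bucket_xyz/some_folder_having_many_files_in_it/")
    println(s"partitions = ${rdd.partitions.length}")

    // You can also ask for more partitions via the minPartitions hint
    // (Spark treats it as a lower bound, not an exact count):
    val finer = sc.textFile("s3n://bucket_xyz/some_folder_having_many_files_in_it/", minPartitions = 64)
    println(s"partitions with hint = ${finer.partitions.length}")

    sc.stop()
  }
}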
Thanks
Best Regards

On Mon, Aug 17, 2015 at 2:29 PM, matd <matd...@gmail.com> wrote:
> Hello,
>
> I would like to understand how the work is parallelized across a Spark
> cluster (and what is left to the driver) when I read several files from a
> single folder in S3: "s3n://bucket_xyz/some_folder_having_many_files_in_it/"
>
> How are files (or file parts) mapped to partitions?
>
> Thanks
> Mathieu