If the file is not splittable (though can I assume a log file is
splittable?), can you advise on how Spark handles such a case? If Spark
can't, what is the widely used practice?
On 3 Sep 2016 7:29 pm, "Raghavendra Pandey"
wrote:
If your file format is splittable, say TSV or CSV, it will be distributed
across all executors.
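
For what it's worth, here is a rough sketch of the difference, in plain Python rather than Spark itself. The plan_splits helper, the 128 MB split size and the 1000 MB file size are all made up for illustration, and the real split logic in Hadoop's input formats has more detail than this. The idea is just that a splittable file is cut into byte ranges that can each go to a separate task, while a non-splittable one (a gzip file, for example) ends up as a single range read by one task.

```python
def plan_splits(file_size, split_size, splittable):
    """Return (offset, length) byte ranges, one per read task.

    Hypothetical helper for illustration only; not a Spark or Hadoop API.
    """
    if file_size == 0:
        return []
    if not splittable:
        # A non-splittable file has to be read start to end by one task.
        return [(0, file_size)]
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


MB = 1024 * 1024

# Assumed sizes: a 1000 MB file and a 128 MB split size.
plain = plan_splits(1000 * MB, 128 * MB, splittable=True)
gzipped = plan_splits(1000 * MB, 128 * MB, splittable=False)

print(len(plain))    # 8 byte ranges, so up to 8 tasks can read in parallel
print(len(gzipped))  # 1 byte range, so a single task reads the whole file
```

In the non-splittable case, a common approach is to call repartition on the RDD right after reading, so that later stages run in parallel even though the initial read does not.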
On Sat, Sep 3, 2016 at 3:38 PM, Somasundaram Sekar <
somasundar.se...@tigeranalytics.com> wrote:
Hi All,
Would like to gain some understanding on the questions listed below,
1. When processing a large file with Apache Spark using, say,
sc.textFile("somefile.xml"), does Spark split it for parallel processing
across executors, or will it be processed as a single chunk in a single
execut