From: Andy Dang
Sent: Wednesday, September 30, 2015 8:17 PM
To: Nicolae Marasoiu
Cc: user@spark.apache.org
Subject: Re: sc.parallelize with defaultParallelism=1
Can't you just load the data from HBase first, and then call sc.parallelize on
your dataset?
-Andy
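
A minimal sketch of the suggestion above, assuming a local master so it runs without a cluster. The `data` sequence is a hypothetical stand-in for whatever rows were read from HBase on the driver; the HBase read itself is not shown. The second argument to `sc.parallelize` is the number of slices (partitions). As far as I understand it, the collection is held by the driver and only shipped out with the task when an action runs, so treat the placement comments as assumptions rather than guarantees.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ParallelizeSketch {
  def main(args: Array[String]): Unit = {
    // "local[1]" is an assumption so the sketch runs on a single machine.
    val conf = new SparkConf()
      .setAppName("parallelize-sketch")
      .setMaster("local[1]")
    val sc = new SparkContext(conf)

    try {
      // Hypothetical stand-in for a small dataset already loaded on the driver.
      val data: Seq[(String, Int)] = Seq(("row1", 1), ("row2", 2), ("row3", 3))

      // One slice: the whole collection becomes a single partition.
      val rdd = sc.parallelize(data, 1)

      println(rdd.partitions.length)        // 1
      println(rdd.map(_._2).reduce(_ + _))  // 6
    } finally {
      sc.stop()
    }
  }
}
```

If the data is small and only needed on the driver, skipping the RDD and working on the plain collection may be simpler.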
---
Regards,
Andy (Nam) Dang
On Wed, Sep 30, 2015 at 12:52 PM, Nicolae Marasoiu <
nicolae.maras...@adswizz.com> wrote:
Hi,
When calling sc.parallelize(data,1), is there a preference where to put the
data? I see 2 possibilities: sending it to a worker node, or keeping it on the
driver program.
I would prefer to keep the data local to the driver. The use case is when I
just need to load a bit of data from HBas