Re: Dealing with 'smaller' data

Gary Malouf Thu, 26 Feb 2015 17:52:31 -0800

The honest answer is that it is unclear to me at this point.  I guess what
I am really wondering is whether there are cases where one would find it
beneficial to use Spark against one or more relational databases?


On Thu, Feb 26, 2015 at 8:06 PM, Tobias Pfeiffer <[email protected]> wrote:

> Gary,
>
> On Fri, Feb 27, 2015 at 8:40 AM, Gary Malouf <[email protected]>
> wrote:
>
>> I'm considering whether or not it is worth introducing Spark at my new
>> company.  The data is nowhere near Hadoop size at this point (it sits in
>> an RDS Postgres cluster).
>>
>
> Will it ever become "Hadoop size"? Looking at the overhead of running even
> a simple Hadoop setup (securely and with good performance, given about 1e6
> configuration parameters), I think it makes sense to stay in non-Hadoop
> mode as long as possible. People may disagree ;-)
>
> Tobias
>
> PS. You may also want to have a look at
> http://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html
>
>
