Re: Spark as standalone or with Hadoop stack.

Ted Yu Tue, 22 Sep 2015 13:04:37 -0700

bq. it's relatively harder to use it with HBase

I agree with Sean.
I work on HBase. To my knowledge, no one runs HBase on top of Mesos.


On Tue, Sep 22, 2015 at 12:31 PM, Sean Owen <so...@cloudera.com> wrote:

> Who told you Mesos would make Spark 100x faster? Does it make sense
> that just the resource manager could make that kind of difference?
> This sounds entirely wrong, or, maybe a mishearing.
>
> I don't know if Mesos is somehow easier to use with Cassandra, but
> it's relatively harder to use it with HBase, HDFS, etc. You probably
> want to use the Hadoop resource manager, YARN, if using Hadoop-ish
> stack components.
>
> As for Spark, the YARN integration actually has some advantages at the
> moment, like dynamic allocation. I think the security story is more
> complete too (? not sure).
>
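
For reference, here is a minimal sketch of how the dynamic allocation mentioned above might be switched on when submitting to YARN. The helper name `build_submit_args`, the jar name, and the executor bounds are hypothetical; the `spark.dynamicAllocation.*` and `spark.shuffle.service.enabled` properties are standard Spark settings. It also assumes the external shuffle service has already been set up on the YARN NodeManagers, which this snippet does not do.

```python
def build_submit_args(app_jar, min_executors=1, max_executors=10):
    """Build a spark-submit argument list that enables dynamic allocation on YARN.

    The function name and default bounds are illustrative only.
    """
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")

    conf = {
        "spark.dynamicAllocation.enabled": "true",
        # Dynamic allocation needs the external shuffle service so that
        # shuffle files survive when idle executors are released.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }

    args = ["spark-submit", "--master", "yarn"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app_jar)  # application jar goes last
    return args


if __name__ == "__main__":
    # Print the command line instead of running it; "my-app.jar" is a placeholder.
    print(" ".join(build_submit_args("my-app.jar", 2, 20)))
```

The same properties could instead go into spark-defaults.conf if they should apply to every job on the cluster.
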
> On Tue, Sep 22, 2015 at 8:25 PM, Shiv Kandavelu
> <shiv.kandav...@riversand.com> wrote:
> >
> >
> > Hi All,
> >
> >
> >
> > We currently have a Hadoop cluster with YARN as the resource manager.
> >
> > We are planning to use HBase as the data store due to the C-P aspects of the
> > CAP Theorem.
> >
> > We now want to do extensive data processing, both on data stored in HBase
> > and as stream processing from an online website / API.
> >
> > We want to use both Spark and MapReduce on the existing Hadoop cluster.
> >
> >
> >
> > One of the recommendations we got was to run Spark as a standalone cluster
> > with Mesos as a resource manager on top of it to monitor and scale. The
> > reason given for this recommendation is that standalone Spark with Mesos is
> > 100x faster than the Spark/YARN/Hadoop combination. It was also mentioned
> > that building on Spark/Mesos can help automatically add Spark nodes on the
> > fly so that processing can scale. Also, it is easy to switch the bottom data
> > stack from HBase to Cassandra or something else if we use Spark.
> >
> >
> >
> > We are in the process of evaluating which stack will work best, and with the
> > knowledge we have, it is getting tough to pick one over the other because of
> > our inexperience with these platforms.
> >
> >
> >
> > Can you help us understand the pros and cons of running Spark as a standalone
> > cluster vs. running it on top of the Hadoop stack?
> >
> >
> >
> > Thanks!
> >
> >
>
