Spark standalone is not multi-tenant; you need one cluster per job. You could try fair scheduling and share a single cluster, but I doubt it will be production-ready.
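
If you do want to try the shared-cluster route, here is a rough sketch of how it could be wired up. It is not a tested setup. As far as I know, the standalone manager schedules applications FIFO and each application takes every available core unless `spark.cores.max` is set, while `spark.scheduler.mode=FAIR` only affects jobs inside one application. So the sketch caps each application's cores and builds the matching `spark-submit` commands. The master URL, file names, weights and memory sizes are all made up for illustration.

```python
import shlex


def split_cores(total_cores, weights):
    """Split total_cores between applications in proportion to their weights.

    Shares are rounded down (with a floor of 1 core), so a few cores may
    stay unallocated. Raises ValueError if the shares would exceed the total.
    """
    total_weight = sum(weights.values())
    shares = {
        name: max(1, total_cores * weight // total_weight)
        for name, weight in weights.items()
    }
    if sum(shares.values()) > total_cores:
        raise ValueError("not enough cores for the requested applications")
    return shares


def submit_command(master_url, app_name, app_file, max_cores,
                   executor_memory, fair_within_app=False):
    """Build a spark-submit argument list with a per-application core cap."""
    conf = {
        "spark.cores.max": str(max_cores),         # cap for this application
        "spark.executor.memory": executor_memory,  # memory per executor
    }
    if fair_within_app:
        # Fair scheduling between jobs *inside* this one application only.
        conf["spark.scheduler.mode"] = "FAIR"
    cmd = ["spark-submit", "--master", master_url, "--name", app_name]
    for key in sorted(conf):
        cmd += ["--conf", f"{key}={conf[key]}"]
    cmd.append(app_file)
    return cmd


# Hypothetical 16-core cluster shared by the three workloads in this thread.
MASTER = "spark://master-host:7077"  # assumed master URL
shares = split_cores(16, {"streaming": 2, "hourly-batch": 1, "zeppelin": 1})

streaming_cmd = submit_command(
    MASTER, "streaming", "streaming_job.py", shares["streaming"], "4g")
batch_cmd = submit_command(
    MASTER, "hourly-batch", "batch_job.py", shares["hourly-batch"], "2g",
    fair_within_app=True)

print(shlex.join(streaming_cmd))
print(shlex.join(batch_cmd))
```

Zeppelin is not started through `spark-submit`, so its share would have to be applied by setting `spark.cores.max` in the Spark interpreter settings. The hourly batch command could then be run from cron or a similar scheduler. Whether this gives enough isolation for production is exactly the open question.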

On 27 Apr 2017 at 5:28 AM, "anna stax" <annasta...@gmail.com> wrote:

> Thanks Cody,
>
> As I already mentioned, I am running spark streaming on an EC2 cluster in
> standalone mode. Now in addition to streaming, I want to be able to run
> a spark batch job hourly and ad hoc queries using Zeppelin.
>
> Can you please confirm that a standalone cluster is OK for this? Please
> provide me some links to help me get started.
>
> Thanks
> -Anna
>
> On Wed, Apr 26, 2017 at 7:46 PM, Cody Koeninger <c...@koeninger.org>
> wrote:
>
>> The standalone cluster manager is fine for production. Don't use Yarn
>> or Mesos unless you already have another need for it.
>>
>> On Wed, Apr 26, 2017 at 4:53 PM, anna stax <annasta...@gmail.com> wrote:
>> > Hi Sam,
>> >
>> > Thank you for the reply.
>> >
>> > What do you mean by
>> > "I doubt people run spark in a single EC2 instance, certainly not in
>> > production I don't think"?
>> >
>> > What is wrong in having a data pipeline on EC2 that reads data from
>> > kafka, processes using spark and outputs to cassandra? Please explain.
>> >
>> > Thanks
>> > -Anna
>> >
>> > On Wed, Apr 26, 2017 at 2:22 PM, Sam Elamin <hussam.ela...@gmail.com>
>> > wrote:
>> >>
>> >> Hi Anna
>> >>
>> >> There are a variety of options for launching spark clusters. I doubt
>> >> people run spark in a single EC2 instance, certainly not in
>> >> production I don't think
>> >>
>> >> I don't have enough information about what you are trying to do, but
>> >> if you are just trying to set things up from scratch then I think you
>> >> can just use EMR, which will create a cluster for you and attach a
>> >> zeppelin instance as well
>> >>
>> >> You can also use databricks for ease of use and very little
>> >> management, but you will pay a premium for that abstraction
>> >>
>> >> Regards
>> >> Sam
>> >>
>> >> On Wed, 26 Apr 2017 at 22:02, anna stax <annasta...@gmail.com> wrote:
>> >>>
>> >>> I need to set up a spark cluster for Spark streaming and scheduled
>> >>> batch jobs and ad hoc queries.
>> >>> Please give me some suggestions. Can this be done in standalone mode?
>> >>>
>> >>> Right now we have a spark cluster in standalone mode on AWS EC2
>> >>> running a spark streaming application. Can we run spark batch jobs
>> >>> and zeppelin on the same? Do we need a better resource manager like
>> >>> Mesos?
>> >>>
>> >>> Are there any companies or individuals that can help in setting
>> >>> this up?
>> >>>
>> >>> Thank you.
>> >>>
>> >>> -Anna