Spark standalone is not multi-tenant; you need one cluster per job. You could
try fair scheduling on a single cluster, but I doubt it would be
production-ready.
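
A minimal sketch of what the single-cluster option could look like, for
anyone who wants to try it. By default a standalone application takes every
available core, so a second application waits until the first one finishes;
capping each application with --total-executor-cores (the spark.cores.max
setting) leaves room for the others. Note that spark.scheduler.mode=FAIR only
shares resources between jobs inside one application, not between separate
applications. The master URL, core counts, memory sizes, and script names
below are hypothetical placeholders, and the helper function is only for
illustration.

```python
# Sketch only: builds spark-submit argument lists that cap each application's
# share of a standalone cluster. Master URL, core counts, memory sizes, and
# script names are hypothetical placeholders.
from typing import List


def spark_submit_args(master: str, app: str, max_cores: int,
                      executor_memory: str, fair: bool = False) -> List[str]:
    """Return an argv list for spark-submit with a per-application core cap."""
    if max_cores < 1:
        raise ValueError("max_cores must be at least 1")
    args = [
        "spark-submit",
        "--master", master,
        # Without a cap, a standalone application takes every available core.
        "--total-executor-cores", str(max_cores),
        "--executor-memory", executor_memory,
    ]
    if fair:
        # FAIR mode schedules jobs inside this one application; it does not
        # arbitrate between separate applications.
        args += ["--conf", "spark.scheduler.mode=FAIR"]
    args.append(app)
    return args


MASTER = "spark://master-host:7077"  # hypothetical master URL
streaming = spark_submit_args(MASTER, "streaming_app.py", 8, "4g")
hourly_batch = spark_submit_args(MASTER, "hourly_batch.py", 4, "4g")
print(" ".join(streaming))
print(" ".join(hourly_batch))
```

Zeppelin's Spark interpreter runs as a long-lived application of its own, so
it would presumably need a similar cap (spark.cores.max in its interpreter
settings) to avoid starving the streaming job.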

On Apr 27, 2017 5:28 AM, "anna stax" <annasta...@gmail.com> wrote:

> Thanks Cody,
>
> As I already mentioned, I am running Spark Streaming on an EC2 cluster in
> standalone mode. Now, in addition to streaming, I want to be able to run a
> Spark batch job hourly and ad hoc queries using Zeppelin.
>
> Can you please confirm that a standalone cluster is OK for this? Please
> provide me some links to help me get started.
>
> Thanks
> -Anna
>
> On Wed, Apr 26, 2017 at 7:46 PM, Cody Koeninger <c...@koeninger.org>
> wrote:
>
>> The standalone cluster manager is fine for production. Don't use YARN
>> or Mesos unless you already have another need for it.
>>
>> On Wed, Apr 26, 2017 at 4:53 PM, anna stax <annasta...@gmail.com> wrote:
>> > Hi Sam,
>> >
>> > Thank you for the reply.
>> >
>> > What do you mean by
>> > "I doubt people run Spark on a single EC2 instance, certainly not in
>> > production, I don't think"?
>> >
>> > What is wrong with having a data pipeline on EC2 that reads data from
>> > Kafka, processes it using Spark, and outputs to Cassandra? Please explain.
>> >
>> > Thanks
>> > -Anna
>> >
>> > On Wed, Apr 26, 2017 at 2:22 PM, Sam Elamin <hussam.ela...@gmail.com>
>> > wrote:
>> >>
>> >> Hi Anna
>> >>
>> >> There are a variety of options for launching Spark clusters. I doubt
>> >> people run Spark on a single EC2 instance, certainly not in production,
>> >> I don't think.
>> >>
>> >> I don't have enough information about what you are trying to do, but if
>> >> you are just trying to set things up from scratch, then I think you can
>> >> just use EMR, which will create a cluster for you and attach a Zeppelin
>> >> instance as well.
>> >>
>> >>
>> >> You can also use Databricks for ease of use and very little management,
>> >> but you will pay a premium for that abstraction.
>> >>
>> >>
>> >> Regards
>> >> Sam
>> >> On Wed, 26 Apr 2017 at 22:02, anna stax <annasta...@gmail.com> wrote:
>> >>>
>> >>> I need to set up a Spark cluster for Spark Streaming, scheduled batch
>> >>> jobs, and ad hoc queries.
>> >>> Please give me some suggestions. Can this be done in standalone mode?
>> >>>
>> >>> Right now we have a Spark cluster in standalone mode on AWS EC2 running
>> >>> a Spark Streaming application. Can we run Spark batch jobs and Zeppelin
>> >>> on the same cluster? Do we need a better resource manager like Mesos?
>> >>>
>> >>> Are there any companies or individuals that can help in setting this
>> >>> up?
>> >>>
>> >>> Thank you.
>> >>>
>> >>> -Anna
>> >
>> >
>>
>
>
