Hey Kam,

It would be nice to have a way to get a failed node back with its
original data, but this isn't strictly necessary; it is just a good
optimization. As long as you run with replication, you can restart a
broker elsewhere with no data, and it will restore its state from the
other replicas.
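
As a rough sketch of what that could look like (all values below are
placeholders, not a recommended setup), the replacement broker's
server.properties would reuse the failed broker's id and point at an
empty log directory. The id is what ties the broker to its partition
assignments, so it has to match for the broker to re-replicate them.

```properties
# Hypothetical values, for illustration only.
# Reuse the id of the failed broker so the replacement takes over
# the same partition assignments.
broker.id=3
# Empty directory on the new host; data is fetched from the other replicas.
log.dirs=/tmp/kafka-logs
# Only affects automatically created topics; existing topics must
# already have been created with a replication factor above 1.
default.replication.factor=3
```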

-Jay

On Wed, Jul 23, 2014 at 3:47 PM, Kam Kasravi
<kamkasr...@yahoo.com.invalid> wrote:
> Hi
>
> Kafka-on-YARN requires YARN to consistently allocate a Kafka broker at a
> particular resource, since the broker always needs to use its local data. YARN
> doesn't do this well unless you override the default scheduler
> (CapacityScheduler or FairScheduler). SequenceIO did something along these
> lines for a different use case. Unfortunately, replacing the scheduler is a
> global operation that would affect all app masters. Additionally, one could
> argue that the broker should be run as an OS service and auto-restarted on
> failure if necessary. Slider (incubating) did some of this groundwork, but
> YARN still has many limitations when it comes to guaranteeing that a
> container is consistently allocated on a particular node, especially on
> app master restart (e.g. when the ResourceManager dies). That said, it might
> be worthwhile to enumerate all of this here with some possible solutions. If
> there is interest, I could certainly list the relevant JIRAs along with some
> additional JIRAs that would be required, IMO.
>
> Thanks
> Kam
>
>
> On Wednesday, July 23, 2014 2:37 PM, "hsy...@gmail.com" <hsy...@gmail.com> 
> wrote:
>
>
>
> Hi guys,
>
> Kafka is getting more and more popular, and in most cases people run Kafka
> as a long-running service in the cluster. Has there been any discussion of
> running Kafka on a YARN cluster, so that we can take advantage of YARN's
> convenient configuration/resource management and HA? I think there is big
> potential and demand for that.
> I found a project, https://github.com/kkasravi/kafka-yarn. But is there an
> official roadmap/plan for this?
>
> Thank you very much!
>
> Best,
> Siyuan
