Thanks Joe for the input related to Mesos, as well as for acknowledging the need for 
YARN to support this type of cluster allocation - long-running services with 
node-locality priority. 

Thanks Jay - that's an interesting fact that I wasn't aware of, though I 
imagine there could be a long latency while the replica data is transferred to 
the new broker (depending on the number and size of partitions). It does open 
up some possibilities for restarting brokers in different containers on app 
master restart (as well as some complications if an old container with stale 
data were reallocated on restart). I had used ZooKeeper to store broker 
locations, so the app master on restart would look up this information and 
attempt to reallocate containers on those nodes. All that said, would this be 
part of Kafka or some other framework? I can see Kafka benefiting from this; at 
the same time, Kafka's appeal IMO is its simplicity. Spark has chosen to 
include YARN support within its distribution; not sure what the Kafka team thinks. 
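The bookkeeping I described can be sketched roughly as follows. This is a minimal, hypothetical simulation in Python: the `BrokerRegistry` class stands in for a ZooKeeper-backed store, and `nodes_to_request` stands in for the app master's restart logic; a real implementation would use a ZooKeeper client and YARN's resource-request APIs, and all names here are made up for illustration.

```python
# Hypothetical sketch: persist broker -> node placements so a restarted
# app master can try to request containers on the same nodes. A plain
# dict stands in for what would really be znodes in ZooKeeper.

class BrokerRegistry:
    """Stand-in for a ZooKeeper-backed store of broker placements."""
    def __init__(self):
        self._placements = {}  # broker_id -> node hostname

    def record(self, broker_id, node):
        self._placements[broker_id] = node

    def placements(self):
        return dict(self._placements)


def nodes_to_request(registry, live_nodes):
    """On app master restart: prefer each broker's previous node if that
    node is still live (its local data is intact); otherwise leave the
    choice to the scheduler and rely on replication to restore state."""
    requests = {}
    for broker_id, node in registry.placements().items():
        if node in live_nodes:
            requests[broker_id] = node   # node-local restart, data intact
        else:
            requests[broker_id] = None   # any node; broker re-replicates
    return requests


registry = BrokerRegistry()
registry.record("broker-1", "node-a")
registry.record("broker-2", "node-b")

# Suppose node-b was lost before the app master restarted.
plan = nodes_to_request(registry, live_nodes={"node-a", "node-c"})
print(plan)  # {'broker-1': 'node-a', 'broker-2': None}
```

The `None` case is exactly where Jay's point applies: with replication enabled, the broker restarted on a fresh node rebuilds its state from the other replicas.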



On Wednesday, July 23, 2014 4:19 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
Hey Kam,

It would be nice to have a way to get a failed node back with its
original data, but this isn't strictly necessary; it's just a good
optimization. As long as you run with replication you can restart a
broker elsewhere with no data, and it will restore its state from the
other replicas.

-Jay


On Wed, Jul 23, 2014 at 3:47 PM, Kam Kasravi
<kamkasr...@yahoo.com.invalid> wrote:
> Hi
>
> Kafka-on-YARN requires YARN to consistently allocate a Kafka broker on a 
> particular node, since the broker needs to always use its local data. YARN 
> doesn't do this well unless you provide (override) the default scheduler 
> (CapacityScheduler or FairScheduler). SequenceIQ did something along these 
> lines for a different use case. Unfortunately, replacing the scheduler is a 
> global operation that would affect all application masters. Additionally, one 
> could argue that the broker should be run as an OS service and auto-restarted 
> on failure if necessary. Slider (incubating) did some of this groundwork, but 
> YARN still has lots of limitations in guaranteeing that a container is 
> consistently allocated on a particular node, especially on app master restart 
> (e.g. if the ResourceManager dies). That said, it might be worthwhile to 
> enumerate all of this here with some possible solutions. If there is interest 
> I could certainly list the relevant JIRAs, along with some additional JIRAs 
> required IMO.
>
> Thanks
> Kam
>
>
> On Wednesday, July 23, 2014 2:37 PM, "hsy...@gmail.com" <hsy...@gmail.com> 
> wrote:
>
>
>
> Hi guys,
>
> Kafka is getting more and more popular, and in most cases people run Kafka
> as a long-term service in the cluster. Is there a discussion of running Kafka
> on a YARN cluster, so that we can utilize its convenient configuration/resource
> management and HA? I think there is big potential and demand for that.
> I found a project, https://github.com/kkasravi/kafka-yarn, but is there an
> official roadmap/plan for this?
>
> Thank you very much!
>
> Best,
> Siyuan
