[ https://issues.apache.org/jira/browse/KAFKA-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954308#comment-13954308 ]
Jay Kreps commented on KAFKA-1348:
----------------------------------

Jay, the cluster maintains knowledge of its own state as brokers join and leave: where each partition is currently hosted, which broker is the leader for each partition, and so on. This is the information the client needs in order to direct its requests, and it is a lot more than just which nodes are alive and in the cluster.

You get this information by issuing a metadata request to any broker in the cluster. Once connected to the cluster, the client keeps issuing metadata requests to find out about cluster changes. It does so automatically at a fixed interval, and also any time it gets an error talking to a broker that might indicate stale metadata (e.g. a network exception, a timeout, a not-leader exception, etc.).

So in your scenario, once connected, the clients discover the new brokers as they are added by sending metadata requests to the existing brokers. As soon as ec2-12-123-456-444.compute-1.amazonaws.com becomes leader for any partition the client needs to send data to, the client will discover the leadership change.

It is true that if you suddenly killed 100% of the brokers in your cluster and replaced them with 100% new brokers, there would be no one left that you know about to tell you about the changes. The solution to this is not to kill 100% of your cluster all at once.

So the problem that needs to be solved is bootstrapping knowledge of at least one active node in the cluster. If you knew one, you could use that node to find out where all the partitions are hosted, and then you could publish data. But how do you find that out? The way we do this is by giving the client a comma-separated list of bootstrap nodes to use for its initial bootstrap on startup. This list is only used during initialization. After initialization, all further metadata updates use the full set of alive nodes.

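To make the bootstrap-then-refresh behaviour described above concrete, here is a rough sketch in Java. It is a toy in-memory model, not the Kafka client: the classes `Cluster`, `Metadata`, and `Client`, the method names, and the `broker-*` host names are all made up for illustration, and a "metadata request" is just a method call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Toy model of bootstrap-then-refresh. All names are hypothetical; this is not the Kafka client API. */
public class BootstrapSketch {

    /** Snapshot a broker hands back for a metadata request. */
    static final class Metadata {
        final List<String> brokers;
        final Map<Integer, String> leaders;

        Metadata(List<String> brokers, Map<Integer, String> leaders) {
            this.brokers = brokers;
            this.leaders = leaders;
        }
    }

    /** In-memory stand-in for a cluster: the live brokers and the leader of each partition. */
    static final class Cluster {
        final Set<String> alive = new LinkedHashSet<>();
        final Map<Integer, String> leaders = new HashMap<>();

        /** Any live broker answers with the full cluster view; a dead one gives no answer. */
        Optional<Metadata> metadataFrom(String broker) {
            if (!alive.contains(broker)) {
                return Optional.empty();
            }
            return Optional.of(new Metadata(new ArrayList<>(alive), new HashMap<>(leaders)));
        }
    }

    /** Client that uses the bootstrap list only for first contact. */
    static final class Client {
        private final Cluster cluster;
        private List<String> knownBrokers;
        private Map<Integer, String> leaders = Collections.emptyMap();

        Client(Cluster cluster, List<String> bootstrap) {
            this.cluster = cluster;
            this.knownBrokers = new ArrayList<>(bootstrap);
            if (!refresh()) {
                throw new IllegalStateException("no bootstrap broker reachable");
            }
        }

        /**
         * Ask each broker we currently know about until one answers, then replace
         * our view with the brokers and leaders it reports. Returns false when
         * nobody we know about is still alive.
         */
        boolean refresh() {
            for (String broker : knownBrokers) {
                Optional<Metadata> md = cluster.metadataFrom(broker);
                if (md.isPresent()) {
                    knownBrokers = md.get().brokers;
                    leaders = md.get().leaders;
                    return true;
                }
            }
            return false;
        }

        String leaderFor(int partition) {
            return leaders.get(partition);
        }

        List<String> knownBrokers() {
            return knownBrokers;
        }
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster();
        cluster.alive.add("broker-a:9092");
        cluster.alive.add("broker-b:9092");
        cluster.leaders.put(0, "broker-a:9092");

        // Only broker-a is configured, but the first metadata response also reveals broker-b.
        Client client = new Client(cluster, Collections.singletonList("broker-a:9092"));
        System.out.println("known after bootstrap: " + client.knownBrokers());

        // Rolling change: broker-c joins and takes over partition 0, then broker-a goes away.
        cluster.alive.add("broker-c:9092");
        cluster.leaders.put(0, "broker-c:9092");
        cluster.alive.remove("broker-a:9092");
        System.out.println("refresh ok: " + client.refresh());
        System.out.println("leader of partition 0: " + client.leaderFor(0));

        // Replacing every known broker at once leaves nobody to ask.
        cluster.alive.clear();
        cluster.alive.add("broker-d:9092");
        System.out.println("refresh ok: " + client.refresh());
    }
}
```

In this model the configured list matters only in the constructor. After that, each refresh goes to whatever brokers the previous response reported, which is why a rolling replacement is survivable and a simultaneous 100% replacement is not.
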
The problem, as I understand it, is that if you already have some home-grown service discovery mechanism, you may not want to configure any bootstrap URLs directly. Instead, you may want to configure the URL of your service discovery system and let it help make initial contact with a broker. This makes sense. But this contact will only be used during initialization, to bootstrap acquiring the full metadata about partition assignment.

Hopefully that makes sense.

> Producer's Broker Discovery Interface
> -------------------------------------
>
>                 Key: KAFKA-1348
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1348
>             Project: Kafka
>          Issue Type: Improvement
>          Components: producer
>            Reporter: Jay Bae
>            Assignee: Jun Rao
>
> Producer has a property 'broker.list' static configuration. I need a
> requirement to be able to override this behavior such as Netflix Eureka
> Discovery module. Let me contribute and please add this to 0.8.1.1 release.

--
This message was sent by Atlassian JIRA
(v6.2#6252)