We need to improve how metadata caching works in Kafka. Currently, the controller sends updated metadata to the individual brokers from multiple places in the code whenever the metadata state changes, which is hard to track. What we should implement instead is a metadata structure in the controller that automatically sends updates when any of a configured set of fields changes. The structure would be initialized on startup with the list of fields that should trigger metadata update requests to the individual brokers; from then on, it would take care of reporting state changes to them. I will follow up on this with a JIRA for the next release.
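Roughly what I have in mind is sketched below. This is only an illustration of the idea, not the final design; the names ObservableMetadata, MetadataField and UpdateSender are placeholders and do not exist in the controller code today.

    import java.util.EnumSet;
    import java.util.Map;
    import java.util.Objects;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // Fields of the controller-side metadata whose changes may need to be propagated.
    enum MetadataField { LIVE_BROKERS, PARTITION_LEADERS, ISR }

    // Callback that pushes update-metadata requests to the individual brokers.
    interface UpdateSender {
        void sendUpdateToBrokers(Set<MetadataField> changedFields);
    }

    // Metadata holder initialized on startup with the set of fields that should
    // trigger update requests; it reports state changes to the brokers itself.
    class ObservableMetadata {
        private final Set<MetadataField> watchedFields;
        private final UpdateSender sender;
        private final Map<MetadataField, Object> values = new ConcurrentHashMap<>();

        ObservableMetadata(Set<MetadataField> watchedFields, UpdateSender sender) {
            this.watchedFields = EnumSet.copyOf(watchedFields);
            this.sender = sender;
        }

        // Every state change funnels through this one method, so the decision of
        // whether brokers need to be notified lives in a single place.
        void update(MetadataField field, Object newValue) {
            Object old = values.put(field, newValue);
            if (watchedFields.contains(field) && !Objects.equals(old, newValue)) {
                sender.sendUpdateToBrokers(EnumSet.of(field));
            }
        }
    }

On startup the controller would construct one instance, for example new ObservableMetadata(EnumSet.of(MetadataField.LIVE_BROKERS, MetadataField.ISR), sender), and every later state change would go through update(), instead of having each call site decide on its own whether to send metadata to the brokers.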
On 7/11/13 10:53 AM, "Colin Blower" <cblo...@barracuda.com> wrote:

>Hm... the cache may explain some odd behavior I was seeing in our
>cluster yesterday.
>
>The ZooKeeper information about which nodes were in-sync replicas was
>different from the data I received in a metadata request response.
>ZooKeeper said two nodes were in the ISR, while the metadata response
>said only the leader was. This was after one of our two nodes went
>down, but when both were back up.
>
>I don't have the time to properly try to reproduce it, but it may be
>something to think about if you are looking at caching issues.
>
>
>On 07/10/2013 10:16 PM, Jun Rao wrote:
>> That's actually not expected. We should only return live brokers to
>> the client. It seems that we never clear the live broker cache in the
>> brokers. This is a bug. Could you file a jira?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Wed, Jul 10, 2013 at 8:52 AM, Vinicius Carvalho <
>> viniciusccarva...@gmail.com> wrote:
>>
>>> Hi there. Once again, I couldn't find docs covering this, so here is
>>> another question.
>>>
>>> My Node.js client connects to the broker, and the first thing it does
>>> is store the topic metadata:
>>>
>>> data received
>>> {
>>>   "brokers": [
>>>     { "nodeId": 0, "host": "10.139.245.106", "port": 9092, "byteLength": 24 },
>>>     { "nodeId": 1, "host": "localhost", "port": 9093, "byteLength": 19 }
>>>   ],
>>>   "topicMetadata": [
>>>     {
>>>       "topicErrorCode": 0,
>>>       "topicName": "foozbar",
>>>       "partitions": [
>>>         { "replicas": [0], "isr": [0], "partitionErrorCode": 0, "partitionId": 0, "leader": 0, "byteLength": 26 },
>>>         { "replicas": [1], "isr": [1], "partitionErrorCode": 0, "partitionId": 1, "leader": 1, "byteLength": 26 },
>>>         { "replicas": [0], "isr": [0], "partitionErrorCode": 0, "partitionId": 2, "leader": 0, "byteLength": 26 },
>>>         { "replicas": [1], "isr": [1], "partitionErrorCode": 0, "partitionId": 3, "leader": 1, "byteLength": 26 },
>>>         { "replicas": [0], "isr": [0], "partitionErrorCode": 0, "partitionId": 4, "leader": 0, "byteLength": 26 }
>>>       ],
>>>       "byteLength": 145
>>>     }
>>>   ],
>>>   "responseSize": 200,
>>>   "correlationId": -1000
>>> }
>>>
>>> Ok, so far so good. Then I kill node 0 on purpose,
>>> trying to simulate a broker failure, and fetch the metadata again:
>>>
>>> data received
>>> {
>>>   "brokers": [
>>>     { "nodeId": 0, "host": "10.139.245.106", "port": 9092, "byteLength": 24 },
>>>     { "nodeId": 1, "host": "localhost", "port": 9093, "byteLength": 19 }
>>>   ],
>>>   "topicMetadata": [
>>>     {
>>>       "topicErrorCode": 0,
>>>       "topicName": "foozbar",
>>>       "partitions": [
>>>         { "replicas": [0], "isr": [], "partitionErrorCode": 5, "partitionId": 0, "leader": -1, "byteLength": 22 },
>>>         { "replicas": [1], "isr": [1], "partitionErrorCode": 0, "partitionId": 1, "leader": 1, "byteLength": 26 },
>>>         { "replicas": [0], "isr": [], "partitionErrorCode": 5, "partitionId": 2, "leader": -1, "byteLength": 22 },
>>>         { "replicas": [1], "isr": [1], "partitionErrorCode": 0, "partitionId": 3, "leader": 1, "byteLength": 26 },
>>>         { "replicas": [0], "isr": [], "partitionErrorCode": 5, "partitionId": 4, "leader": -1, "byteLength": 22 }
>>>       ],
>>>       "byteLength": 133
>>>     }
>>>   ],
>>>   "responseSize": 188,
>>>   "correlationId": -1000
>>> }
>>>
>>> Well, I can see from the partition metadata that some partitions have
>>> no leader (-1), but my problem is that I actually rely on the broker
>>> list to create a pool of connections, and even when broker 0 is down
>>> I still get it back from the metadata. Is this expected? I know the
>>> brokers may just be a list of all the places where that topic could
>>> be found, but in that case couldn't we at least have a flag indicating
>>> whether that broker is online or not?
>>>
>>> Regards
>>>
>>>
>>> --
>>> The intuitive mind is a sacred gift and the
>>> rational mind is a faithful servant. We have
>>> created a society that honors the servant and
>>> has forgotten the gift.
>>>
>
>
>--
>*Colin Blower*
>/Software Engineer/
>Barracuda Networks Inc.
>+1 408-342-5576 (o)
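One more note on the client-side issue raised above: until the stale live-broker cache is fixed, a possible workaround is to build the connection pool only from brokers that currently lead at least one partition, derived from the partition metadata rather than from the raw broker list. Below is a rough sketch; the Broker and PartitionMeta types are placeholders that just mirror the JSON fields in the responses above, not the actual client classes.

    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Illustration-only holders mirroring fields from the metadata response.
    record Broker(int nodeId, String host, int port) {}
    record PartitionMeta(int partitionId, int leader, List<Integer> isr) {}

    class BrokerFilter {
        // Keep only brokers that currently lead at least one partition.
        // A dead broker (leader == -1 for all of its partitions) drops out,
        // even though it still appears in the top-level broker list.
        static List<Broker> usableBrokers(List<Broker> brokers, List<PartitionMeta> partitions) {
            Set<Integer> leaders = new HashSet<>();
            for (PartitionMeta p : partitions) {
                if (p.leader() >= 0) {
                    leaders.add(p.leader());
                }
            }
            List<Broker> usable = new ArrayList<>();
            for (Broker b : brokers) {
                if (leaders.contains(b.nodeId())) {
                    usable.add(b);
                }
            }
            return usable;
        }
    }

With the two responses above, usableBrokers() would return both brokers before the failure and only broker 1 afterwards, which is what a client would want for its connection pool.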