[ https://issues.apache.org/jira/browse/KAFKA-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208201#comment-15208201 ]
Simon Cooper commented on KAFKA-3296:
-------------------------------------

{code}
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=uncompressed
    Topic: __consumer_offsets Partition: 0 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 1 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 2 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 3 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 4 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 5 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 6 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 7 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 8 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 9 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 10 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 11 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 12 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 13 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 14 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 15 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 16 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 17 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 18 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 19 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 20 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 21 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 22 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 23 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 24 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 25 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 26 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 27 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 28 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 29 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 30 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 31 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 32 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 33 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 34 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 35 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 36 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 37 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 38 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 39 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 40 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 41 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 42 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 43 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 44 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 45 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 46 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 47 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 48 Leader: none Replicas: 1 Isr:
    Topic: __consumer_offsets Partition: 49 Leader: none Replicas: 1 Isr:
{code}

Trying to produce or consume from __consumer_offsets produces
{{[2016-03-23 10:31:07,515] WARN Error while fetching metadata with correlation id 0 : {test=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)}}

> All consumer reads hang indefinitely
> ------------------------------------
>
> Key: KAFKA-3296
> URL: https://issues.apache.org/jira/browse/KAFKA-3296
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.9.0.0, 0.9.0.1
> Reporter: Simon Cooper
> Priority: Critical
> Attachments: controller.zip, kafkalogs.zip
>
>
> We've got several integration tests that bring up systems on VMs for testing.
> We've recently upgraded to 0.9, and very occasionally we see an issue where
> every consumer that tries to read from the broker hangs, spamming the
> following in their logs:
> {code}
> 2016-02-26T12:25:37,856 | DEBUG | o.a.k.c.NetworkClient [pool-10-thread-1] | Sending metadata request ClientRequest(expectResponse=true, callback=null, request=RequestSend(header={api_key=3,api_version=0,correlation_id=21905,client_id=consumer-1}, body={topics=[Topic1]}), isInitiatedByNetworkClient, createdTimeMs=1456489537856, sendTimeMs=0) to node 1
> 2016-02-26T12:25:37,856 | DEBUG | o.a.k.c.Metadata [pool-10-thread-1] | Updated cluster metadata version 10954 to Cluster(nodes = [Node(1, server.internal, 9092)], partitions = [Partition(topic = Topic1, partition = 0, leader = 1, replicas = [1,], isr = [1,]])
> 2016-02-26T12:25:37,856 | DEBUG | o.a.k.c.c.i.AbstractCoordinator [pool-10-thread-1] | Issuing group metadata request to broker 1
> 2016-02-26T12:25:37,857 | DEBUG | o.a.k.c.c.i.AbstractCoordinator [pool-10-thread-1] | Group metadata response ClientResponse(receivedTimeMs=1456489537857, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@28edb273, request=RequestSend(header={api_key=10,api_version=0,correlation_id=21906,client_id=consumer-1}, body={group_id=}), createdTimeMs=1456489537856, sendTimeMs=1456489537856), responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}})
> 2016-02-26T12:25:37,956 | DEBUG | o.a.k.c.NetworkClient [pool-10-thread-1] | Sending metadata request ClientRequest(expectResponse=true, callback=null, request=RequestSend(header={api_key=3,api_version=0,correlation_id=21907,client_id=consumer-1}, body={topics=[Topic1]}), isInitiatedByNetworkClient, createdTimeMs=1456489537956, sendTimeMs=0) to node 1
> 2016-02-26T12:25:37,956 | DEBUG | o.a.k.c.Metadata [pool-10-thread-1] | Updated cluster metadata version 10955 to Cluster(nodes = [Node(1, server.internal, 9092)], partitions = [Partition(topic = Topic1, partition = 0, leader = 1, replicas = [1,], isr = [1,]])
> 2016-02-26T12:25:37,956 | DEBUG | o.a.k.c.c.i.AbstractCoordinator [pool-10-thread-1] | Issuing group metadata request to broker 1
> 2016-02-26T12:25:37,957 | DEBUG | o.a.k.c.c.i.AbstractCoordinator [pool-10-thread-1] | Group metadata response ClientResponse(receivedTimeMs=1456489537957, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@40cee8cc, request=RequestSend(header={api_key=10,api_version=0,correlation_id=21908,client_id=consumer-1}, body={group_id=}), createdTimeMs=1456489537956, sendTimeMs=1456489537956), responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}})
> 2016-02-26T12:25:38,056 | DEBUG | o.a.k.c.NetworkClient [pool-10-thread-1] | Sending metadata request ClientRequest(expectResponse=true, callback=null, request=RequestSend(header={api_key=3,api_version=0,correlation_id=21909,client_id=consumer-1}, body={topics=[Topic1]}), isInitiatedByNetworkClient, createdTimeMs=1456489538056, sendTimeMs=0) to node 1
> 2016-02-26T12:25:38,056 | DEBUG | o.a.k.c.Metadata [pool-10-thread-1] | Updated cluster metadata version 10956 to Cluster(nodes = [Node(1, server.internal, 9092)], partitions = [Partition(topic = Topic1, partition = 0, leader = 1, replicas = [1,], isr = [1,]])
> 2016-02-26T12:25:38,056 | DEBUG | o.a.k.c.c.i.AbstractCoordinator [pool-10-thread-1] | Issuing group metadata request to broker 1
> 2016-02-26T12:25:38,057 | DEBUG | o.a.k.c.c.i.AbstractCoordinator [pool-10-thread-1] | Group metadata response ClientResponse(receivedTimeMs=1456489538057, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@439e25fb, request=RequestSend(header={api_key=10,api_version=0,correlation_id=21910,client_id=consumer-1}, body={group_id=}), createdTimeMs=1456489538056, sendTimeMs=1456489538056), responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}})
> {code}
> This persists for any 0.9 consumer trying to read from the topic (we haven't
> confirmed whether this affects a single topic or any topic on the broker).
> 0.8 consumers can read from the broker without issues. A broker restart
> fixes it.
> This was observed on a single-broker cluster. There were no suspicious log
> messages on the server.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)