[ https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neha Narkhede resolved KAFKA-513.
---------------------------------

    Resolution: Fixed

Thanks for patch v5, Swapnil, the state change log looks great! I checked in that patch after the following minor changes -

1. Partition
Fixed the error trace in makeFollower to include correlationId, controllerId and controllerEpoch

2. PartitionStateMachine
2.1 In initializeLeaderAndIsrForPartition(),
Changed NEW -> New
Changed ONLINE -> Online
2.2 In electLeaderForPartition(),
Removed the duplicate "Controller %d epoch %d"
2.3 Removed the error from getLeaderIsrAndEpochOrThrowException() in state change log since it is already logged in the catch block of electLeaderForPartition()
2.4 Changed trace() to error() wherever required

3. ReplicaStateMachine
Included an error statement in the state change log

> Add state change log to Kafka brokers
> -------------------------------------
>
>                 Key: KAFKA-513
>                 URL: https://issues.apache.org/jira/browse/KAFKA-513
>             Project: Kafka
>          Issue Type: Sub-task
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Swapnil Ghike
>            Priority: Blocker
>              Labels: p1, replication, tools
>             Fix For: 0.8
>
>         Attachments: kafka-513-v1.patch, kafka-513-v2.patch,
> kafka-513-v3.patch, kafka-513-v4.patch, kafka-513-v5-corrected.patch,
> kafka-513-v5.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Once KAFKA-499 is checked in, every controller to broker communication can be
> modelled as a state change for one or more partitions. Every state change
> request will carry the controller epoch. If there is a problem with the state
> of some partitions, it will be good to have a tool that can create a timeline
> of requested and completed state changes.
> This will require each broker to output a state change log that has entries like
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for partition [foo, 0] from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for partition [foo, 0] from controller 2, epoch 1
> On controller, this will look like -
> [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change request LeaderAndIsr() for partition [foo, 0]
> We need a tool that can collect the state change log from all brokers and
> create a per-partition timeline of state changes -
> [foo, 0]
> [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change request LeaderAndIsr()
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from controller 2, epoch 1
> This JIRA involves adding the state change log to each broker and adding the
> tool to create the timeline
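
For illustration only, here is a minimal sketch of the merge step the description asks for: take state change log lines gathered from several brokers, group them by partition, and order each group by timestamp. It is not the tool from the attached patches. It assumes every entry has exactly the shape of the examples in the description ("[timestamp] ... for partition [topic, id] ..."), and the names build_timeline and format_timeline are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed entry shape, taken from the examples in the issue description:
#   [timestamp] <text> for partition [topic, id]<optional trailing text>
ENTRY = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"(?P<before>.*?) for partition "
    r"\[(?P<topic>[^,\]]+), (?P<partition>\d+)\](?P<after>.*)$"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def build_timeline(lines):
    """Group entries by (topic, partition) and sort each group by time.

    Returns a dict mapping (topic, partition) to a list of
    (timestamp_text, message) pairs. Lines that do not match the
    assumed format are skipped.
    """
    grouped = defaultdict(list)
    for line in lines:
        match = ENTRY.match(line.strip())
        if match is None:
            continue
        when = datetime.strptime(match["ts"], TS_FORMAT)
        key = (match["topic"], int(match["partition"]))
        # Drop the "for partition [...]" clause; the partition becomes the header.
        message = match["before"] + match["after"]
        grouped[key].append((when, match["ts"], message))
    return {
        key: [(ts, msg) for _, ts, msg in sorted(entries, key=lambda e: e[0])]
        for key, entries in grouped.items()
    }


def format_timeline(timeline):
    """Render the timeline as one header line per partition, then its entries."""
    out = []
    for topic, partition in sorted(timeline):
        out.append(f"[{topic}, {partition}]")
        for ts, msg in timeline[(topic, partition)]:
            out.append(f"[{ts}] {msg}")
    return "\n".join(out)
```

Feeding the three example entries above into build_timeline in any order, then passing the result to format_timeline, should give a "[foo, 0]" header followed by the controller entry and the two broker entries in timestamp order. Sorting on timestamps alone assumes broker clocks are reasonably in sync; a real tool would probably also want to use the controller epoch carried in each entry.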