[ https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Swapnil Ghike updated KAFKA-513:
--------------------------------
    Attachment: kafka-513-v2.patch

Patch v2:

1.1.a Statements other than "abort transition" are logged in ReplicaManager. Please also refer to 3 below.

1.1.b Logging statements for the controller seem to be of the format "controller sent a LeaderAndIsr request to broker %d". Please let me know if I should change them. Please also refer to 2.1.b below.

1.2 Will KAFKA-649 take care of this? Also, [topic, partition] seems to be the more common format in our current code.

1.3 Made the change.

2.1.a I have created a separate file, controller.log, in log4j.properties. Earlier, all the controller logging statements were sent to state-change.log.

2.1.b Removed all trace statements from the state-change log. Should the debug statement in ControllerBrokerRequestBatch.sendRequestsToBrokers be in state-change.log? The other debug statements have also been removed from the state-change log.

2.2 Made the change.

3 We should probably keep it, since it's nice to have the logIdent identifier at the beginning of each logging statement in state-change.log. Let me know what you think.

4 Made the change.

5 Rebase took care of this.

6.1 Changed the input name to 'stateChangeLog'. Hopefully the description is also clearer now.

6.2 I think it's ok if we merge everything together. Grepping for a topic or a partition is straightforward.

6.3 I tried merging 6 files of 11k lines each. The total memory consumed did not rise above 200MB. To optimize a bit, I replaced the immutable.TreeMap used in the last patch with mutable.HashMap. These (date --> lines) hashmaps are sorted by converting each one to a sequence before printing. Sorting should be ok, since each such hashmap will contain only a handful of entries for a single [topic, partition].
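To illustrate the merging approach described in 6.3, here is a minimal sketch in Python (the actual tool in the patch is written in Scala and is not reproduced here). It assumes entries follow the "[timestamp] ... for partition [topic, partition] ..." shape from the issue description. The names ENTRY, merge_state_change_logs and format_timeline are hypothetical and exist only for this example.

```python
import re
from collections import defaultdict

# Hypothetical pattern, assuming the entry shape from the issue description:
#   [timestamp] <message> for partition [topic, partition] <rest>
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*?)\s+for partition "
    r"\[(?P<topic>[^,\]]+),\s*(?P<part>\d+)\](?P<rest>.*)$"
)


def merge_state_change_logs(lines):
    """Group entries by (topic, partition), then by timestamp.

    The inner dict plays the role of the (date --> lines) hash map:
    it is unsorted while being filled and sorted only when printed.
    """
    timelines = defaultdict(lambda: defaultdict(list))
    for line in lines:
        m = ENTRY.match(line.strip())
        if m is None:
            continue  # skip lines that are not state change entries
        key = (m["topic"].strip(), int(m["part"]))
        # Drop the "for partition [...]" part; the key already carries it.
        text = (m["msg"] + m["rest"]).strip()
        timelines[key][m["ts"]].append(text)
    return timelines


def format_timeline(timelines):
    """Return the per-partition timeline as a list of output lines."""
    out = []
    for topic, part in sorted(timelines):
        out.append(f"[{topic}, {part}]")
        # Timestamps like "2012-09-10 10:06:17,280" sort correctly as
        # plain strings, so no date parsing is needed in this sketch.
        for ts, entries in sorted(timelines[(topic, part)].items()):
            for entry in entries:
                out.append(f"[{ts}] {entry}")
    return out
```

Lines from all brokers can be concatenated in any order and passed to merge_state_change_logs. Sorting happens once per partition at print time, which stays cheap as long as each partition has only a handful of entries.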
> Add state change log to Kafka brokers
> -------------------------------------
>
>                 Key: KAFKA-513
>                 URL: https://issues.apache.org/jira/browse/KAFKA-513
>             Project: Kafka
>          Issue Type: Sub-task
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Swapnil Ghike
>            Priority: Blocker
>              Labels: p1, replication, tools
>             Fix For: 0.8
>
>         Attachments: kafka-513-v1.patch, kafka-513-v2.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Once KAFKA-499 is checked in, every controller to broker communication can be
> modelled as a state change for one or more partitions. Every state change
> request will carry the controller epoch. If there is a problem with the state
> of some partitions, it will be good to have a tool that can create a timeline
> of requested and completed state changes. This will require each broker to
> output a state change log that has entries like
>
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for partition [foo, 0] from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for partition [foo, 0] from controller 2, epoch 1
>
> On the controller, this will look like:
>
> [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change request LeaderAndIsr() for partition [foo, 0]
>
> We need a tool that can collect the state change log from all brokers and
> create a per-partition timeline of state changes:
>
> [foo, 0]
> [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change request LeaderAndIsr()
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from controller 2, epoch 1
>
> This JIRA involves adding the state change log to each broker and adding the
> tool to create the timeline.