[ 
https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592884#comment-13592884
 ] 

Neha Narkhede commented on KAFKA-513:
-------------------------------------

Thanks for the patch! This patch is great and is very close to being checked 
in. Few review suggestions -

1. ControllerChannelManager
1.1 The log message when the controller sends the data includes the 
controller's epoch, but the response doesn't. It is useful to know the epoch 
wherever the controller id is logged.
1.2 The "," between controller id and epoch probably could be skipped, this 
pattern is not followed elsewhere in this patch.

2. Partition
2.1 The error statement when the broker is dropping the leaderAndIsr request 
should be -

"since local leader epoch %d is >= the request's leader epoch"

2.2 Also, let's add "Broker %d aborted..." to the following statement similar 
to "Broker %d discarded..."

"Aborted the become-follower state change since leader %d for partition [%s,%d]"

3. PartitionStateMachine
In electLeaderForPartition() API, it is useful to include the contents of the 
StateChangedFailedException in the state change log. This is because it is 
useful to know that after starting the state change, it got aborted because 
another controller with a higher controller epoch was detected. So something 
like -
"Controller %d epoch %d aborted leader election for partition [%s,%d] since ..."

4. ReplicaManager

The log statements here are of the form of [Replica Manager on Broker %d]: 
Handling... This is different from the ones in Partition which are like "Broker 
%d started become-follower...". I guess the default logIdent, which is meant 
for the main log, is the cause of this discrepancy. For the purpose of the 
state change log, we actually don't care if it is Replica Manager or Partition 
on that broker, it would just suffice to have the simplest statement to 
communicate the state changes like the ones in Partition except the logIdent.

5. ReplicaStateMachine
Same as 4

6. Rename stateChangeLogMerger.scala to have 1st letter in caps.

7. StateChangeLogMerger

This tool looks great now. Thanks for including the review suggestions. Few 
minor observations -

7.1 You check if at least one of the two log input options are specified. I 
think we should also check that at most one of them is specified. In other 
words, what's the expected behavior if both are speci
fied ?

                
> Add state change log to Kafka brokers
> -------------------------------------
>
>                 Key: KAFKA-513
>                 URL: https://issues.apache.org/jira/browse/KAFKA-513
>             Project: Kafka
>          Issue Type: Sub-task
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Swapnil Ghike
>            Priority: Blocker
>              Labels: p1, replication, tools
>             Fix For: 0.8
>
>         Attachments: kafka-513-v1.patch, kafka-513-v2.patch, 
> kafka-513-v3.patch, kafka-513-v4.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Once KAFKA-499 is checked in, every controller to broker communication can be 
> modelled as a state change for one or more partitions. Every state change 
> request will carry the controller epoch. If there is a problem with the state 
> of some partitions, it will be good to have a tool that can create a timeline 
> of requested and completed state changes. This will require each broker to 
> output a state change log that has entries like
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for 
> partition [foo, 0] from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for 
> partition [foo, 0] from controller 2, epoch 1
> On controller, this will look like -
> [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change 
> request LeaderAndIsr() for partition [foo, 0]
> We need a tool that can collect the state change log from all brokers and 
> create a per-partition timeline of state changes -
> [foo, 0]
> [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change 
> request LeaderAndIsr() 
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from 
> controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from 
> controller 2, epoch 1
> This JIRA involves adding the state change log to each broker and adding the 
> tool to create the timeline 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to