We are using 0.8.1.1. How do we identify controller migration? Is it in logs or some metrics?
Allen

On Tue, Jan 27, 2015 at 9:35 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> Allen, which version of Kafka are you using? And if you have multiple
> brokers, is there a controller migration happened before?
>
> Guozhang
>
> On Fri, Jan 23, 2015 at 3:56 PM, Allen Wang <aw...@netflix.com.invalid>
> wrote:
>
> > Hello,
> >
> > We tried the ReassignPartitionsCommand to move partitions to new brokers.
> > The execution initially showed message "Successfully started reassignment
> > of partitions ...". But when I tried to verify using --verify option, it
> > reported some reassignments have failed:
> >
> > ERROR: Assigned replicas (0,5,2) don't match the list of replicas for
> > reassignment (0,5) for partition [vhs_playback_event,1]
> > ERROR: Assigned replicas (4,5,0,2) don't match the list of replicas for
> > reassignment (4,5) for partition [vhs_playback_event,11]
> > ERROR: Assigned replicas (3,5,0,2) don't match the list of replicas for
> > reassignment (3,5) for partition [vhs_playback_event,16]
> >
> > I noticed that the assigned replicas in the error messages include both
> > old assignment and new assignment. Is this a real error or just means
> > partitions are being copied and current state does not match the final
> > expected state?
> >
> > Since I was confused by the errors, I ran the same
> > ReassignPartitionsCommand with the same assignment again but got some
> > additional failure messages complaining that three partitions do not
> > exist:
> >
> > [2015-01-23 18:15:41,333] ERROR Skipping reassignment of partition
> > [vhs_playback_event,16] since it doesn't exist
> > (kafka.admin.ReassignPartitionsCommand)
> > [2015-01-23 18:15:41,455] ERROR Skipping reassignment of partition
> > [vhs_playback_event,15] since it doesn't exist
> > (kafka.admin.ReassignPartitionsCommand)
> > [2015-01-23 18:15:41,499] ERROR Skipping reassignment of partition
> > [vhs_playback_event,17] since it doesn't exist
> > (kafka.admin.ReassignPartitionsCommand)
> >
> > These partitions later reappeared from the output of --verify.
> >
> > The other thing is that at one point the BytesOut from one broker exceeds
> > 100Mbytes, which is quite alarming.
> >
> > In the end, the reassignment was done according to the input file to
> > ReassignPartitionsCommand. But the UnderReplicatedPartitions for the
> > brokers keeps showing a positive number, even though the output of
> > describe topic command and ZooKeeper data show the ISRs are all in sync,
> > and Replica-MaxLag is 0.
> >
> > To sum up, the overall execution is successful but the error messages are
> > quite noisy and the metric is not consistent with what appears to be.
> >
> > Does anyone have the similar experience and is there anything we can do
> > get it done smoother? What can we do to reset the inconsistent
> > UnderReplicatedPartitions metric?
> >
> > Thanks,
> > Allen
>
> --
> -- Guozhang
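
For anyone comparing similar --verify output, here is a rough Python sketch that parses the "Assigned replicas ... don't match" lines quoted above and lists which brokers are in the assigned list but not in the reassignment target. The function names (`parse_verify_error`, `extra_replicas`, `target_is_subset`) are made up for illustration and are not part of Kafka. The script only summarizes the messages; whether a superset means the reassignment is still in progress is the open question in this thread, and the sketch does not answer it.

```python
import re

# Pattern for the --verify error lines quoted in this thread.
VERIFY_ERROR = re.compile(
    r"Assigned replicas \(([\d,]+)\) don't match the list of replicas "
    r"for reassignment \(([\d,]+)\) for partition \[([^,\]]+),(\d+)\]"
)


def parse_verify_error(line):
    """Return (topic, partition, assigned, target), or None if no match."""
    # Collapse runs of whitespace so a message wrapped by a mail client
    # and re-joined still matches.
    match = VERIFY_ERROR.search(" ".join(line.split()))
    if match is None:
        return None
    assigned = [int(b) for b in match.group(1).split(",")]
    target = [int(b) for b in match.group(2).split(",")]
    return match.group(3), int(match.group(4)), assigned, target


def extra_replicas(assigned, target):
    """Brokers present in the assigned list but absent from the target."""
    return [b for b in assigned if b not in target]


def target_is_subset(assigned, target):
    """True when every target broker already appears in the assigned list."""
    return set(target) <= set(assigned)


if __name__ == "__main__":
    sample = [
        "ERROR: Assigned replicas (0,5,2) don't match the list of replicas "
        "for reassignment (0,5) for partition [vhs_playback_event,1]",
        "ERROR: Assigned replicas (4,5,0,2) don't match the list of replicas "
        "for reassignment (4,5) for partition [vhs_playback_event,11]",
    ]
    for line in sample:
        topic, partition, assigned, target = parse_verify_error(line)
        print(topic, partition,
              "extra:", extra_replicas(assigned, target),
              "target covered:", target_is_subset(assigned, target))
```

Feeding it the saved output of --verify, one line at a time, gives a per-partition list of leftover brokers that can be compared against the original assignment.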