Allen, which version of Kafka are you using? Also, if you have multiple brokers, did a controller migration happen before this?

Guozhang

On Fri, Jan 23, 2015 at 3:56 PM, Allen Wang <aw...@netflix.com.invalid> wrote:

> Hello,
>
> We tried the ReassignPartitionsCommand to move partitions to new brokers.
> The execution initially showed message "Successfully started reassignment
> of partitions ...". But when I tried to verify using --verify option, it
> reported some reassignments have failed:
>
> ERROR: Assigned replicas (0,5,2) don't match the list of replicas for
> reassignment (0,5) for partition [vhs_playback_event,1]
> ERROR: Assigned replicas (4,5,0,2) don't match the list of replicas for
> reassignment (4,5) for partition [vhs_playback_event,11]
> ERROR: Assigned replicas (3,5,0,2) don't match the list of replicas for
> reassignment (3,5) for partition [vhs_playback_event,16]
>
> I noticed that the assigned replicas in the error messages include both old
> assignment and new assignment. Is this a real error or just means
> partitions are being copied and current state does not match the final
> expected state?
>
> Since I was confused by the errors, I ran the same
> ReassignPartitionsCommand with the same assignment again but got some
> additional failure messages complaining that three partitions do not exist:
>
> [2015-01-23 18:15:41,333] ERROR Skipping reassignment of partition
> [vhs_playback_event,16] since it doesn't exist
> (kafka.admin.ReassignPartitionsCommand)
> [2015-01-23 18:15:41,455] ERROR Skipping reassignment of partition
> [vhs_playback_event,15] since it doesn't exist
> (kafka.admin.ReassignPartitionsCommand)
> [2015-01-23 18:15:41,499] ERROR Skipping reassignment of partition
> [vhs_playback_event,17] since it doesn't exist
> (kafka.admin.ReassignPartitionsCommand)
>
> These partitions later reappeared from the output of --verify.
>
> The other thing is that at one point the BytesOut from one broker exceeds
> 100Mbytes, which is quite alarming.
>
> In the end, the reassignment was done according to the input file to
> ReassignPartitionsCommand. But the UnderReplicatedPartitions for the
> brokers keeps showing a positive number, even though the output of describe
> topic command and ZooKeeper data show the ISRs are all in sync, and
> Replica-MaxLag is 0.
>
> To sum up, the overall execution is successful but the error messages are
> quite noisy and the metric is not consistent with what appears to be.
>
> Does anyone have the similar experience and is there anything we can do get
> it done smoother? What can we do to reset the inconsistent
> UnderReplicatedPartitions metric?
>
> Thanks,
> Allen
>

--
-- Guozhang
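
For anyone comparing the two replica lists in the --verify errors by hand, here is a rough sketch that does the comparison. It is not part of Kafka or of ReassignPartitionsCommand; the names `VERIFY_OUTPUT`, `PATTERN` and `leftover_replicas` are made up for illustration, and the regular expression assumes the message format is exactly what is quoted above (with the mail line wrapping removed). It lists, per partition, the brokers that are currently assigned but absent from the target list, which going by Allen's observation would be the replicas from the old assignment.

```python
import re

# Messages copied from the --verify output quoted above,
# with the mail client's line wrapping removed.
VERIFY_OUTPUT = """\
ERROR: Assigned replicas (0,5,2) don't match the list of replicas for reassignment (0,5) for partition [vhs_playback_event,1]
ERROR: Assigned replicas (4,5,0,2) don't match the list of replicas for reassignment (4,5) for partition [vhs_playback_event,11]
ERROR: Assigned replicas (3,5,0,2) don't match the list of replicas for reassignment (3,5) for partition [vhs_playback_event,16]
"""

# Assumed message format, based only on the three lines quoted in this thread.
PATTERN = re.compile(
    r"Assigned replicas \(([\d,]+)\) don't match the list of replicas "
    r"for reassignment \(([\d,]+)\) for partition \[([^,\]]+),(\d+)\]"
)


def leftover_replicas(text):
    """Map (topic, partition) to brokers assigned now but not in the target list."""
    result = {}
    for m in PATTERN.finditer(text):
        assigned = [int(b) for b in m.group(1).split(",")]
        target = {int(b) for b in m.group(2).split(",")}
        key = (m.group(3), int(m.group(4)))
        # Keep the order in which the brokers appear in the assigned list.
        result[key] = [b for b in assigned if b not in target]
    return result


if __name__ == "__main__":
    for (topic, partition), extra in sorted(leftover_replicas(VERIFY_OUTPUT).items()):
        print(f"{topic}-{partition}: assigned but not in target: {extra}")
```

In all three quoted messages the target list is a subset of the assigned list, so the only difference is the extra brokers reported by the sketch.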