Hi group,

We've been trying to track down a problem for a little while this morning, and
I thought I'd ask here while we keep looking.

We have 3 nodes (rep-3) running 0.8.1.1. Yesterday we attempted a rolling
upgrade to 0.8.2.1, and after restarting the first node, a single topic (a
samza intermediate topic) started throwing replica fetcher errors over and
over ("NotLeaderForPartition"). Other things may or may not have been
attempted at the time (not by me, so I cannot say for sure). Anyway, we ended
up rolling back to 0.8.1.1, and ALL data had been DELETED from that node. It
spent most of yesterday re-syncing and came back into sync last night, and a
rebalance made everything run smoothly (except for these damn replica fetcher
errors for that one partition).

Today my colleague attempted the "unsupported" topic delete command for the 
"bad" partition, and bounced that one troublesome node.

When the node comes up, I can see in server.log that it reads in all of the
segments and then starts spitting out a samza topic fetch error, and through
JMX the "ReplicaManager".LeaderCount is 0. It is not attempting to fetch or
load any topics.

The other two brokers are showing under-replicated partitions (obviously).
What is going wrong? And how can we get that samza topic really and truly
gone, if that is what is keeping the broker from coming up?

Thanks,
Thunder
