Digging in a bit more, it appears that the "down" broker had likely only partially failed, so it was still attempting to fetch offsets that no longer exist. Does this make sense as an explanation of the behavior described above?
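
If it helps to verify that, here is a rough sketch (assuming a broker reachable at localhost:9092 and ZooKeeper at localhost:2181; adjust for the actual cluster) for comparing the offset range the leader actually holds for [mytopic,57] against what the follower keeps requesting, and for listing the partitions that are currently under-replicated:

    # Earliest (-2) and latest (-1) offsets the leader currently has for mytopic, partition 57
    bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
      --broker-list localhost:9092 --topic mytopic --partitions 57 --time -2
    bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
      --broker-list localhost:9092 --topic mytopic --partitions 57 --time -1

    # Partitions whose in-sync replica set is smaller than the full replica set
    bin/kafka-topics.sh --zookeeper localhost:2181 --describe --under-replicated-partitions

If the earliest offset reported by the leader is well past 0, that would line up with the "Request for offset 0 but we only have log segments in the range 39 to 39" message, i.e. the follower is asking for data that retention has already deleted.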
On Thu, Feb 5, 2015 at 10:58 AM, Kyle Banker <kyleban...@gmail.com> wrote:
> Dug into this a bit more, and it turns out that we lost one of our 9
> brokers at the exact moment when this started happening. At the time that
> we lost the broker, we had no under-replicated partitions. Since the broker
> disappeared, we've had a fairly constant number of under-replicated
> partitions. This makes some sense, of course.
>
> Still, the log message doesn't.
>
> On Thu, Feb 5, 2015 at 10:39 AM, Kyle Banker <kyleban...@gmail.com> wrote:
>
>> I have a 9-node Kafka cluster, and all of the brokers just started
>> spouting the following error:
>>
>> ERROR [Replica Manager on Broker 1]: Error when processing fetch request
>> for partition [mytopic,57] offset 0 from follower with correlation id
>> 58166. Possible cause: Request for offset 0 but we only have log segments
>> in the range 39 to 39. (kafka.server.ReplicaManager)
>>
>> The "mytopic" topic has a replication factor of 3, and metrics are
>> showing a large number of under-replicated partitions.
>>
>> My assumption is that a log segment aged out but that the replicas
>> weren't aware of it.
>>
>> In any case, this problem isn't fixing itself, and the volume of log
>> messages of this type is enormous.
>>
>> What might have caused this? How does one resolve it?