Hey guys, another small issue to report for 0.8.1.  After a couple of days, 3
of my brokers had fallen off the ISR list for 2-3 of their partitions.

I didn't see anything unusual in the logs, so I just restarted one.  It came
up fine, but as it loaded its logs these messages showed up:

[2013-12-21 19:25:19,968] WARN [ReplicaFetcherThread-0-2], Replica 1 for
partition [Events2,58] reset its fetch offset to current leader 2's start
offset 1042738519 (kafka.server.ReplicaFetcherThread)
[2013-12-21 19:25:19,969] WARN [ReplicaFetcherThread-0-14], Replica 1 for
partition [Events2,28] reset its fetch offset to current leader 14's start
offset 1043415514 (kafka.server.ReplicaFetcherThread)
[2013-12-21 19:25:20,012] WARN [ReplicaFetcherThread-0-2], Current offset
1011209589 for partition [Events2,58] out of range; reset offset to
1042738519 (kafka.server.ReplicaFetcherThread)
[2013-12-21 19:25:20,013] WARN [ReplicaFetcherThread-0-14], Current offset
1010086751 for partition [Events2,28] out of range; reset offset to
1043415514 (kafka.server.ReplicaFetcherThread)
[2013-12-21 19:25:20,036] WARN [ReplicaFetcherThread-0-14], Replica 1 for
partition [Events2,71] reset its fetch offset to current leader 14's start
offset 1026871415 (kafka.server.ReplicaFetcherThread)
[2013-12-21 19:25:20,036] WARN [ReplicaFetcherThread-0-2], Replica 1 for
partition [Events2,44] reset its fetch offset to current leader 2's start
offset 1052372907 (kafka.server.ReplicaFetcherThread)
[2013-12-21 19:25:20,036] WARN [ReplicaFetcherThread-0-14], Current offset
993879706 for partition [Events2,71] out of range; reset offset to
1026871415 (kafka.server.ReplicaFetcherThread)
[2013-12-21 19:25:20,036] WARN [ReplicaFetcherThread-0-2], Current offset
1020715056 for partition [Events2,44] out of range; reset offset to
1052372907 (kafka.server.ReplicaFetcherThread)

Judging by the network traffic and disk usage changes after the restart
(both jumped up), a couple of the partition replicas had fallen behind and
are now catching up.
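For what it's worth, the gap between each replica's last fetch offset and the leader's start offset can be read straight out of those WARN lines, which gives a rough idea of how far behind each partition was. A quick throwaway sketch (the regex simply matches the "out of range" message format above; it's not an official Kafka tool):

```python
import re

# Matches the ReplicaFetcherThread "out of range" WARN lines and captures
# the replica's stale offset, the topic, the partition, and the new offset.
PATTERN = re.compile(
    r"Current offset (\d+) for partition \[(\w+),(\d+)\] out of range; "
    r"reset offset to (\d+)"
)

def offset_gaps(log_lines):
    """Return {(topic, partition): messages_skipped} for each reset line."""
    gaps = {}
    for line in log_lines:
        m = PATTERN.search(line)
        if m:
            old, topic, part, new = m.groups()
            gaps[(topic, int(part))] = int(new) - int(old)
    return gaps

logs = [
    "[2013-12-21 19:25:20,012] WARN [ReplicaFetcherThread-0-2], Current offset "
    "1011209589 for partition [Events2,58] out of range; reset offset to "
    "1042738519 (kafka.server.ReplicaFetcherThread)",
]
print(offset_gaps(logs))  # {('Events2', 58): 31528930}
```

So partition 58 above was roughly 31.5 million offsets behind the leader's log start, which lines up with the traffic spike during catch-up.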


On Thu, Dec 19, 2013 at 4:37 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:

> Hi Drew,
>
> That problem will be fixed by
> https://issues.apache.org/jira/browse/KAFKA-1074. I think we are close to
> checking that in to trunk.
>
> Thanks,
> Neha
>
>
> On Wed, Dec 18, 2013 at 9:02 AM, Drew Goya <d...@gradientx.com> wrote:
>
> > Thanks Neha, I rolled upgrades and completed a rebalance!
> >
> > I ran into a few small issues I figured I would share.
> >
> > On a few brokers, there were some log directories left over from some
> > failed rebalances which prevented the 0.8.1 brokers from starting once I
> > completed the upgrade.  These directories contained an index file and a
> > zero-size log file; once I cleaned those out, the brokers were able to
> > start up fine.  If anyone else runs into the same problem and is running
> > RHEL, this is the bash script I used to clean them out:
> >
> > du --max-depth=1 -h /data/kafka/logs | grep K | sed 's/.*K.//' | xargs sudo rm -r
> >
> >
> > On Tue, Dec 17, 2013 at 10:42 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> >
> > > There are no compatibility issues. You can roll upgrades through the
> > > cluster one node at a time.
> > >
> > > Thanks
> > > Neha
> > >
> > >
> > > On Tue, Dec 17, 2013 at 9:15 AM, Drew Goya <d...@gradientx.com> wrote:
> > >
> > > > So I'm going to be going through the process of upgrading a cluster
> > from
> > > > 0.8.0 to the trunk (0.8.1).
> > > >
> > > > I'm going to be expanding this cluster several times and the problems
> > > with
> > > > reassigning partitions in 0.8.0 mean I have to move to trunk(0.8.1)
> > asap.
> > > >
> > > > Will it be safe to roll upgrades through the cluster one by one?
> > > >
> > > > Also are there any client compatibility issues I need to worry about?
> > >  Am I
> > > > going to need to pause/upgrade all my consumers/producers at once or
> > can
> > > I
> > > > roll upgrades through the cluster and then upgrade my clients one by
> > one?
> > > >
> > > > Thanks in advance!
> > > >
> > >
> >
>