The brokers also seem to be unavailable while this is going on. Each of these log messages takes 2-3 seconds, so with about 1200 partitions the recovery takes quite a while. It does eventually recover, but sadly it goes down again soon after I start sending it messages.
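As a rough back-of-the-envelope check on those numbers, here is a minimal sketch. It assumes the logs are loaded strictly one at a time and that all of the roughly 1200 partitions live on the broker being restarted; neither assumption is confirmed above, and the function name is made up for illustration.

```python
def estimate_recovery_minutes(partitions, seconds_per_log):
    """Estimate total log recovery time in minutes.

    Assumes logs are recovered serially (one after another), so the
    total is simply partitions * seconds_per_log.
    """
    return partitions * seconds_per_log / 60.0


# Figures taken from the message above: ~1200 partitions, 2-3 s per log.
low = estimate_recovery_minutes(1200, 2)
high = estimate_recovery_minutes(1200, 3)
print(f"Estimated recovery time: {low:.0f} to {high:.0f} minutes")
```

If those assumptions hold, a restart that is still loading logs after 30 minutes is in line with the per-log timings, and the open question becomes why each log takes 2-3 seconds to recover.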
On Sat, Nov 22, 2014 at 11:23 AM, Rajiv Kurian <ra...@signalfuse.com> wrote:
> A 3 node kafka broker cluster went down yesterday (all nodes) and I just
> noticed it this morning. When I restarted it this morning, I see a lengthy
> list of messages like this:
>
> Loading log 'mytopic-partitionNum'
> Recovering unflushed segment 'some number' in log mytopic-partitionNum.
> Completed load of log mytopic-partitionNum with log end offset someOffset
>
> It's been going on for more than 30 minutes since I restarted the broker.
> I have quite a few partitions (over 1000) but I still wouldn't expect it to
> take such a long time.
>
> Any ideas on how I should investigate the problem?
>
> Thanks!