> Could a fetch request from a consumer cause a Topic creation request (seems implausible).....
Yes, that seems like a way the broker can get into this situation.

Thanks,
Neha

On Tue, Nov 5, 2013 at 4:58 PM, Jason Rosenberg <j...@squareup.com> wrote:
> I don't know if I have a way to see the access logs on the LB...... (still trying to track that down).
>
> One thing I do see, though, is that there are fetch requests from consumers that are then followed by these Topic creation log messages, e.g. (replaced some of the specific strings in this log line :)):
>
> 2013-11-04 23:28:07,828 WARN [kafka-request-handler-1] server.KafkaApis - [KafkaApi-10] Fetch request with correlation id 0 from client my-consumer-app-ConsumerFetcherThread-my-consumer-app_myhost.mydc.mycompany-1383602435614-2397d1ab-0-10 on partition [mytopic,0] failed due to Partition [mytopic,0] doesn't exist on 10
> 2013-11-04 23:28:07,847 INFO [kafka-request-handler-7] admin.AdminUtils$ - Topic creation { "partitions":{ "0":[ 10, 9 ] }, "version":1 }
>
> It's hard to say whether there's a correlation between these 2 log lines (since the 'Topic creation' log line doesn't include the name of the topic. Shouldn't it?).
>
> Could a fetch request from a consumer cause a Topic creation request (seems implausible).....
>
> Jason
>
> On Tue, Nov 5, 2013 at 9:51 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > > Ok, so this can happen, even if the node has not been placed back into rotation, at the metadata vip?
> >
> > Hmmm... if the broker is not placed in the metadata vip, how did it end up receiving metadata requests? You may want to investigate that by checking the public access logs.
> >
> > Thanks,
> > Neha
> >
> > On Nov 4, 2013 9:41 PM, "Jason Rosenberg" <j...@squareup.com> wrote:
> > > Ok, so this can happen, even if the node has not been placed back into rotation, at the metadata vip?
> > >
> > > On Tue, Nov 5, 2013 at 12:11 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > > > It is probably due to the same metadata propagation issue. https://issues.apache.org/jira/browse/KAFKA-1111 should fix that.
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > > On Mon, Nov 4, 2013 at 11:47 AM, Jason Rosenberg <j...@squareup.com> wrote:
> > > > > Ok,
> > > > >
> > > > > After adding a delay before enabling a freshly started broker in the metadata vip that clients use, it seems to have drastically reduced the number of these topic creation requests. However, not all of them.
> > > > >
> > > > > I still see (only sometimes) a handful of "Topic creation" log messages that happen on a freshly started broker, during the time kafka has started but before the broker has been enabled in the metadata VIP. Is there an explanation for this?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Jason
> > > > >
> > > > > On Fri, Nov 1, 2013 at 4:45 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > > > > > The mbeans are explained here - http://kafka.apache.org/documentation.html#monitoring. Look for *QueueTimeMs
> > > > > >
> > > > > > Thanks,
> > > > > > Neha
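The *QueueTimeMs mbeans mentioned above can also be read programmatically over JMX. A minimal sketch, assuming the broker exposes JMX at broker-host:9999 (e.g. via JMX_PORT); the mbean name pattern and the "Mean" attribute are assumptions to verify against the monitoring docs for your Kafka version:

    import java.util.Set;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class QueueTimeProbe {
        public static void main(String[] args) throws Exception {
            // Hypothetical host/port; assumes the broker was started with JMX enabled.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                // Query all kafka.network mbeans and keep the queue-time histograms.
                // Exact mbean names differ across Kafka versions, so match loosely.
                Set<ObjectName> names = conn.queryNames(
                        new ObjectName("kafka.network:*"), null);
                for (ObjectName name : names) {
                    if (name.getCanonicalName().contains("QueueTimeMs")) {
                        // Histogram mbeans expose attributes such as Mean and Max.
                        System.out.println(name + " mean=" + conn.getAttribute(name, "Mean"));
                    }
                }
            }
        }
    }

A sustained rise in these queue-time means while the flood of "Topic creation" messages is going on would support the high-queue-time theory discussed below.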
> > > > > > On Fri, Nov 1, 2013 at 12:14 PM, Jason Rosenberg <j...@squareup.com> wrote:
> > > > > > > Neha,
> > > > > > >
> > > > > > > This cluster has on the order of 750 topics.
> > > > > > >
> > > > > > > It looks like if I add a 20 second delay before placing a broker into the vip for metadata requests, it never seems to have this issue. So I'm not sure about the 104 seconds number, other than that was how long the flood of "Topic creation" log messages went on for (over 500 of these).
> > > > > > >
> > > > > > > Which metric should I look at for 'high queue time'?
> > > > > > >
> > > > > > > Jason
> > > > > > >
> > > > > > > On Fri, Nov 1, 2013 at 5:54 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > > > > > > > It is proportional to the number of topics but still seems too long. Did the broker have high queue time on all requests? Also how many topics existed on this cluster?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Neha
> > > > > > > >
> > > > > > > > On Oct 31, 2013 10:56 PM, "Jason Rosenberg" <j...@squareup.com> wrote:
> > > > > > > > > In this case, it appears to have gone on for 104 seconds. Should it take that long? It doesn't seem to always take this long. I guess one of the drawbacks of having a large number of topics (brings out the edge cases).
> > > > > > > > >
> > > > > > > > > Filed: https://issues.apache.org/jira/browse/KAFKA-1111
> > > > > > > > >
> > > > > > > > > On Fri, Nov 1, 2013 at 1:17 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > > > > > > > > >> Would it be possible for the broker to make sure all metadata is propagated, before completing the startup sequence, and taking on metadata requests?
> > > > > > > > > >
> > > > > > > > > > Yes, that is the flaw I mentioned before. We need to fix it.
> > > > > > > > > >
> > > > > > > > > > On Thu, Oct 31, 2013 at 9:56 PM, Jason Rosenberg <j...@squareup.com> wrote:
> > > > > > > > > > > Ok,
> > > > > > > > > > >
> > > > > > > > > > > So, I can safely ignore these, it sounds like. I don't see any corresponding logging around it subsequently not succeeding to actually create the topic in zk.
> > > > > > > > > > >
> > > > > > > > > > > Would it be possible for the broker to make sure all metadata is propagated, before completing the startup sequence, and taking on metadata requests?
> > > > > > > > > > >
> > > > > > > > > > > If I were to not place the newly started broker back into the vip for a delay period after the kafka broker has started, would that solve this issue?
> > > > > > > > > > >
> > > > > > > > > > > Jason
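The grace-period workaround discussed above can be scripted as part of a rolling restart. A rough sketch, assuming ZooKeeper at zk-host:2181, broker id 10, the 20-second delay reported to work in this thread, and a hypothetical enableInVip call standing in for whatever API the load balancer/vip actually exposes:

    import java.util.concurrent.TimeUnit;
    import org.apache.zookeeper.ZooKeeper;

    public class DelayedVipEnable {
        public static void main(String[] args) throws Exception {
            String zkConnect = "zk-host:2181";   // hypothetical connect string
            int brokerId = 10;                   // hypothetical broker id
            long grace = TimeUnit.SECONDS.toMillis(20); // delay that worked above

            ZooKeeper zk = new ZooKeeper(zkConnect, 30000, event -> { });
            try {
                // A broker registers an ephemeral node under /brokers/ids on startup.
                String path = "/brokers/ids/" + brokerId;
                while (zk.exists(path, false) == null) {
                    Thread.sleep(500); // not registered yet; keep polling
                }
                // Hold the broker out of the vip a bit longer so the controller can
                // propagate metadata before clients are routed to it.
                Thread.sleep(grace);
                enableInVip(brokerId);
            } finally {
                zk.close();
            }
        }

        // Placeholder: replace with the real call that puts the broker back in the vip.
        private static void enableInVip(int brokerId) {
            System.out.println("broker " + brokerId + " re-enabled in vip");
        }
    }

Note that this only narrows the window; as the thread observes, a few metadata requests can still reach the broker before propagation finishes, which is what KAFKA-1111 addresses.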
> > > > > > > > > > > On Fri, Nov 1, 2013 at 12:39 AM, Jun Rao <jun...@gmail.com> wrote:
> > > > > > > > > > > > This is probably because some metadata requests were issued to a newly started broker before the controller has propagated the metadata. Since the broker doesn't have the metadata in its cache, it thinks the topic doesn't exist and tries to create it. The creation will fail, though, since the topic is already registered in ZK. This INFO logging should be transient.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Oct 31, 2013 at 8:41 PM, Jason Rosenberg <j...@squareup.com> wrote:
> > > > > > > > > > > > > Sometimes during a rolling restart, I see lots of these messages (these happened in the logs of the 2nd server in the cluster to restart (broker 11), after it had restarted, and during the time the 3rd node (broker 9) was doing a controlled shutdown before stopping).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Is it creating new topics (it shouldn't have been!)? Or is it moving topics around, or what?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2013-11-01 03:25:55,661 INFO [kafka-request-handler-7] admin.AdminUtils$ - Topic creation { "partitions":{ "0":[ 9, 11 ] }, "version":1 }
> > > > > > > > > > > > > 2013-11-01 03:26:01,577 INFO [kafka-request-handler-0] admin.AdminUtils$ - Topic creation { "partitions":{ "0":[ 9, 10 ] }, "version":1 }
> > > > > > > > > > > > > 2013-11-01 03:26:07,767 INFO [kafka-request-handler-1] admin.AdminUtils$ - Topic creation { "partitions":{ "0":[ 9, 11 ] }, "version":1 }
> > > > > > > > > > > > > 2013-11-01 03:26:07,823 INFO [kafka-request-handler-1] admin.AdminUtils$ - Topic creation { "partitions":{ "0":[ 10, 11 ] }, "version":1 }
> > > > > > > > > > > > > 2013-11-01 03:26:11,183 INFO [kafka-request-handler-2] admin.AdminUtils$ - Topic creation { "partitions":{ "0":[ 10, 11 ] }, "version":1 }
> > > > > > > > > > > > > 20
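To make Jun's explanation concrete, here is a toy model (not the actual broker code) of the window he describes: with auto.create.topics.enable=true, a metadata request for a topic missing from the freshly started broker's cache logs "Topic creation" and then fails against the existing ZK registration, so nothing actually changes:

    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    public class TransientTopicCreationModel {
        // Stands in for the broker's in-memory metadata cache (empty right after startup).
        static final Set<String> metadataCache = ConcurrentHashMap.newKeySet();
        // Stands in for topics already registered under /brokers/topics in ZooKeeper.
        static final Set<String> zkTopics = ConcurrentHashMap.newKeySet();

        static void handleMetadataRequest(String topic) {
            if (metadataCache.contains(topic)) {
                return; // normal case once the controller has propagated metadata
            }
            // Cache miss: with auto-creation enabled, the broker attempts a create,
            // and the "Topic creation" line is logged before the attempt.
            System.out.println("INFO admin.AdminUtils$ - Topic creation "
                    + "{ \"partitions\":{ \"0\":[ 10, 9 ] }, \"version\":1 }");
            try {
                createInZk(topic);
            } catch (IllegalStateException alreadyExists) {
                // Already registered in ZK: the create fails, the cluster is
                // unchanged, and the log line above is the only visible effect.
            }
        }

        static void createInZk(String topic) {
            if (!zkTopics.add(topic)) {
                throw new IllegalStateException("node exists: /brokers/topics/" + topic);
            }
        }

        public static void main(String[] args) {
            zkTopics.add("mytopic");          // the topic existed before the restart
            handleMetadataRequest("mytopic"); // prints the transient "Topic creation" line
        }
    }

This also matches Jason's observation that the log line names broker ids but no topic: what gets printed is just a proposed replica assignment (partition id mapped to replica broker ids).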