Hi Shikhar,

I do not see a stderr log file anywhere. Can you point me to where Kafka would write such a file?
On Thu, Jun 30, 2016 at 5:10 PM, Shikhar Bhushan <shik...@confluent.io> wrote:

> Perhaps it's a JVM crash? You might not see anything in the standard
> application-level logs, you'd need to look for the stderr.
>
> On Thu, Jun 30, 2016 at 5:07 PM allen chan <allen.michael.c...@gmail.com> wrote:
>
> > Anyone else have ideas?
> >
> > This is still happening. I moved off zookeeper from the server to its own
> > dedicated VMs.
> > Kakfa starts with 4G of heap and gets nowhere near that much consumed when
> > it crashed.
> > i bumped up the zookeeper timeout settings but that has not solved it.
> >
> > I also disconnected all the producers and consumers. This point something
> > between kafka and zookeeper right?
> >
> > Again logs are no help as to why kafka decided to shut itself down
> > https://gist.github.com/allenmchan/f9331e54bb4fd77cc5bc0b031a7a6206
> >
> > On Thu, Jun 2, 2016 at 4:22 PM, Russ Lavoie <russlav...@gmail.com> wrote:
> >
> > > What about in dmesg? I have run into this issue and it was the OOM
> > > killer. I also ran into a heap issue using too much of the direct memory
> > > (JVM). Reducing the fetcher threads helped with that problem.
> > >
> > > On Jun 2, 2016 12:19 PM, "allen chan" <allen.michael.c...@gmail.com> wrote:
> > >
> > > > Hi Tom,
> > > >
> > > > That is one of the first things that i checked. Active memory never goes
> > > > above 50% of overall available. File cache uses the rest of the memory but
> > > > i do not think that causes OOM killer.
> > > > Either way there is no entries in /var/log/messages (centos) to show OOM is
> > > > happening.
> > > >
> > > > Thanks
> > > >
> > > > On Thu, Jun 2, 2016 at 5:36 AM, Tom Crayford <tcrayf...@heroku.com> wrote:
> > > >
> > > > > That looks like somebody is killing the process. I'd suspect either the
> > > > > linux OOM killer or something else automatically killing the JVM for some
> > > > > reason.
> > > > >
> > > > > For the OOM killer, assuming you're on ubuntu, it's pretty easy to find in
> > > > > /var/log/syslog (depending on your setup). I don't know about other
> > > > > operating systems.
> > > > >
> > > > > On Thu, Jun 2, 2016 at 5:54 AM, allen chan <allen.michael.c...@gmail.com> wrote:
> > > > >
> > > > > > I have an issue where my brokers would randomly shut itself down.
> > > > > > I turned on debug in log4j.properties but still do not see a reason why the
> > > > > > shutdown is happening.
> > > > > >
> > > > > > Anyone seen this behavior before?
> > > > > >
> > > > > > version 0.10.0
> > > > > > log4j.properties
> > > > > > log4j.rootLogger=DEBUG, kafkaAppender
> > > > > > * I tried TRACE level but i do not see any additional log messages
> > > > > >
> > > > > > snippet of log around shutdown
> > > > > > [2016-06-01 15:11:51,374] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:11:53,376] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:11:55,377] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:11:57,380] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:11:59,383] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:01,386] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:03,389] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:04,121] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> > > > > > [2016-06-01 15:12:04,121] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> > > > > > [2016-06-01 15:12:05,390] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:07,393] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:09,396] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:11,399] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:13,334] INFO [Kafka Server 2], shutting down (kafka.server.KafkaServer)
> > > > > > [2016-06-01 15:12:13,334] INFO [Kafka Server 2], shutting down (kafka.server.KafkaServer)
> > > > > > [2016-06-01 15:12:13,336] INFO [Kafka Server 2], Starting controlled shutdown (kafka.server.KafkaServer)
> > > > > > [2016-06-01 15:12:13,336] INFO [Kafka Server 2], Starting controlled shutdown (kafka.server.KafkaServer)
> > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name connections-closed: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name connections-created: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name bytes-sent-received: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name bytes-sent: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,339] DEBUG Added sensor with name bytes-received: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,339] DEBUG Added sensor with name select-time: (org.apache.kafka.common.metrics.Metrics)
> > > > > >
> > > > > > --
> > > > > > Allen Michael Chan
> > > >
> > > > --
> > > > Allen Michael Chan
> >
> > --
> > Allen Michael Chan

--
Allen Michael Chan
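
For reference, the two checks suggested in this thread (OOM-killer entries in the system log, and evidence of a JVM crash) could be scripted roughly as below. This is only a sketch under assumptions: the function names are made up for illustration, the kernel message wording varies between kernel versions so the pattern may need adjusting, and the crash-log search relies on HotSpot's default behaviour of writing an hs_err_pid<pid>.log file into the process working directory on a fatal error, so it will find nothing if the broker was started with a different error-file location. The log paths are the ones mentioned above (/var/log/messages on CentOS, /var/log/syslog on Ubuntu), and the directory passed to the crash-log search is a placeholder for wherever the broker was started from.

```python
import re
from pathlib import Path

# Kernel messages that typically accompany an OOM kill.
# Assumption: exact wording differs between kernel versions.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(?:ed)? process", re.IGNORECASE
)


def find_oom_lines(lines):
    """Return the lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def find_jvm_crash_logs(directory):
    """Return HotSpot fatal-error logs (hs_err_pid<pid>.log) directly in `directory`."""
    return sorted(Path(directory).glob("hs_err_pid*.log"))


if __name__ == "__main__":
    # System log locations mentioned in the thread; either may be absent.
    for log_path in ("/var/log/messages", "/var/log/syslog"):
        try:
            with open(log_path, errors="replace") as handle:
                for hit in find_oom_lines(handle):
                    print(f"{log_path}: {hit}")
        except OSError:
            # Missing file or insufficient permissions: skip this log.
            continue

    # Placeholder: replace "." with the broker's working directory.
    for crash_log in find_jvm_crash_logs("."):
        print(f"possible JVM crash log: {crash_log}")
```

Reading the system logs usually needs root. An empty result only means these particular traces were not found, not that the process was shut down cleanly.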