[ 
https://issues.apache.org/jira/browse/KAFKA-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950609#comment-15950609
 ] 

Aegeaner commented on KAFKA-4975:
---------------------------------

You can set unclean.leader.election.enable to true in the config.

Unclean leader election: a follower goes down while the leader keeps 
appending messages. The follower comes back up and, before it has completely 
caught up with the leader's log, all replicas in the ISR go down. The 
follower is then uncleanly elected as the new leader and starts appending 
messages from clients. When the old leader comes back up and becomes a 
follower, it may discover that the current leader's end offset is behind its 
own end offset.
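
As a rough illustration of the scenario above, here is a minimal Python sketch 
of the choice a follower faces when it sees that the leader's end offset is 
behind its own. This is a simplified model, not Kafka's actual code: the 
Replica class, the follower_action function and the returned strings are 
hypothetical names made up for this example. The offsets in the usage lines 
are the ones from the FATAL log quoted in the issue below.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical stand-in for one replica of a partition."""
    broker_id: int
    log_end_offset: int


def follower_action(leader: Replica, follower: Replica,
                    unclean_election_enabled: bool) -> str:
    """Return what the follower does after comparing end offsets.

    Simplified model (an assumption, not the broker implementation):
    - "fetch":    leader is level with or ahead of the follower
    - "truncate": leader is behind and unclean election is allowed,
                  so the follower drops its extra messages
    - "exit":     leader is behind and unclean election is not allowed,
                  so the follower refuses to truncate its log
    """
    if leader.log_end_offset >= follower.log_end_offset:
        return "fetch"
    if unclean_election_enabled:
        return "truncate"
    return "exit"


# Offsets taken from the FATAL log line for __consumer_offsets-27.
leader = Replica(broker_id=7, log_end_offset=0)
follower = Replica(broker_id=15, log_end_offset=1734972)

print(follower_action(leader, follower, unclean_election_enabled=False))
print(follower_action(leader, follower, unclean_election_enabled=True))
```

With unclean election disabled, the model ends in the "exit" branch, which 
corresponds to the "Exiting because log truncation is not allowed" message. 
With it enabled, the follower truncates instead, at the cost of losing the 
messages the new leader never received.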

> Kafka process is running, but not listening to 9092 port
> --------------------------------------------------------
>
>                 Key: KAFKA-4975
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4975
>             Project: Kafka
>          Issue Type: Bug
>          Components: network
>    Affects Versions: 0.10.1.1
>         Environment: A cluster of 15 Kafka brokers connected to a cluster of 
> 3 Zookeeper servers, all in the same data center.
> uname -a: Linux dc3-kafka-02 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 
> 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> Kafka brokers hardware specs:
> H/W path       Device     Class      Description
> ================================================
>                           system     SR ((^_^))
> /0                        bus        SR
> /0/0                      memory     128KiB BIOS
> /0/4                      processor  Intel(R) Atom(TM) CPU  C2750  @ 2.40GHz
> /0/4/5                    memory     448KiB L1 cache
> /0/4/6                    memory     4MiB L2 cache
> /0/15                     memory     16GiB System Memory
> /0/15/0                   memory     8GiB DIMM DDR3 Synchronous 1600 MHz (0.6 
> ns)
> /0/15/1                   memory     DIMM DDR3 Synchronous [empty]
> /0/15/2                   memory     8GiB DIMM DDR3 Synchronous 1600 MHz (0.6 
> ns)
> /0/15/3                   memory     DIMM DDR3 Synchronous [empty]
> /0/100                    bridge     Atom processor C2000 SoC Transaction 
> Router
> /0/100/f                  generic    Atom processor C2000 RCEC
> /0/100/13                 generic    Atom processor C2000 SMBus 2.0
> /0/100/14      enp0s20f0  network    Ethernet Connection I354 2.5 GbE 
> Backplane
> /0/100/14.1    enp0s20f1  network    Ethernet Connection I354 2.5 GbE 
> Backplane
> /0/100/16                 bus        Atom processor C2000 USB Enhanced Host 
> Controller
> /0/100/16/1    usb1       bus        EHCI Host Controller
> /0/100/16/1/1             bus        USB hub
> /0/100/18                 storage    Atom processor C2000 AHCI SATA3 
> Controller
> /0/100/1f                 bridge     Atom processor C2000 PCU
> /0/100/1f.3               bus        Atom processor C2000 PCU SMBus
> /0/101                    bridge     Atom processor C2000 RAS
> /0/1           scsi0      storage    
> /0/1/0.0.0     /dev/sda   disk       256GB SAMSUNG MZ7LN256
> /0/1/0.0.0/1   /dev/sda1  volume     190MiB EXT4 volume
> /0/1/0.0.0/2   /dev/sda2  volume     237GiB EXT4 volume
> /0/1/0.0.0/3   /dev/sda3  volume     976MiB Linux swap volume
> /1                        power      CRB Battery 0
> /2                        power      OEM Define 5
>            Reporter: Rafael Telles
>            Priority: Critical
>
> I have two clusters of Kafka brokers, one of them (with 15 brokers + 3 
> Zookeeper servers) became sick (a lot of under-replicated partitions, 
> throwing a lot of NotEnoughReplicasExceptions). I logged in some of the 
> brokers that other couldn't connect to, and I found out that they were all 
> running their Kafka process, but they were not listening to the default TCP 
> port (9092) as expected:
> root@dc3-kafka-02:/home/kafka/kafka_2.11-0.10.1.1# ps aux | grep kafka
> root     14055 21.6 33.6 23001236 5513176 ?    Sl   Mar23 1866:20 
> /usr/lib/jvm/java-8-oracle/bin/java -Xms2G -Xmx6G -server -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:+DisableExplicitGC -Djava.awt.headless=true 
> -Xloggc:/home/kafka/kafka_2.11-0.10.1.1/bin/../logs/kafkaServer-gc.log 
> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port=17264 
> -Dkafka.logs.dir=/home/kafka/kafka_2.11-0.10.1.1/bin/../logs 
> -Dlog4j.configuration=file:/home/kafka/kafka_2.11-0.10.1.1/bin/../config/log4j.properties
>  -cp 
> :/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/argparse4j-0.5.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/connect-api-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/connect-file-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/connect-json-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/connect-runtime-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/guava-18.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/hk2-api-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/hk2-locator-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/hk2-utils-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-annotations-2.6.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-core-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-databind-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javassist-3.18.2-GA.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.annotation-api-1.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.inject-1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.inject-2.4.0-b34.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.servlet-api-3.1.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/javax.ws.rs-api-2.0.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-client-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-common-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-container-servlet-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-guava-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-media-jaxb-2.2
2.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jersey-server-2.22.2.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-http-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-io-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-security-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-server-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jetty-util-9.2.15.v20160210.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/jopt-simple-4.9.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka_2.11-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka_2.11-0.10.1.1-sources.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka_2.11-0.10.1.1-test-sources.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-clients-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-log4j-appender-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-streams-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-streams-examples-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/kafka-tools-0.10.1.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/log4j-1.2.17.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/lz4-1.3.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/metrics-core-2.2.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/osgi-resource-locator-1.0.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/raven-7.8.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/raven-log4j-7.8.1.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/reflections-0.9.10.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/rocksdbjni-4.9.0.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/scala-library-2.11.8.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/scala
-parser-combinators_2.11-1.0.4.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/slf4j-api-1.7.21.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/slf4j-log4j12-1.7.21.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/snappy-java-1.1.2.6.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/validation-api-1.1.0.Final.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/zkclient-0.9.jar:/home/kafka/kafka_2.11-0.10.1.1/bin/../libs/zookeeper-3.4.8.jar
>  kafka.Kafka /home/kafka/kafka_2.11-0.10.1.1/config/server.properties
> root     28615  0.0  0.0  14180  1024 pts/0    S+   13:35   0:00 grep 
> --color=auto kafka
> root@dc3-kafka-02:/home/kafka/kafka_2.11-0.10.1.1# netstat -tulpn | grep 9092
> ...returns empty
> If I restart Kafka in these brokers, they start listening to 9092 again.
> Update, I found this in the logs, (I restarted the broker, it started 
> listening to 9092, then it stopped):
> [2017-03-29 15:11:38,181] INFO Awaiting socket connections on xxx:9092. 
> (kafka.network.Acceptor)
> [2017-03-29 15:11:38,195] INFO [Socket Server on Broker 15], Started 1 
> acceptor threads (kafka.network.SocketServer)
> [2017-03-29 15:15:15,254] INFO [Socket Server on Broker 15], Shutting down 
> (kafka.network.SocketServer)
> [2017-03-29 15:15:15,357] INFO [Socket Server on Broker 15], Shutdown 
> completed (kafka.network.SocketServer)
> And there are these FATAL errors too:
> [2017-03-29 15:13:30,114] FATAL [ReplicaFetcherThread-0-7], Exiting because 
> log truncation is not allowed for partition __consumer_offsets-27, Current 
> leader 7's latest offset 0 is less than replica 15's latest offset 1734972 
> (kafka.server.ReplicaFetcherThread)
> [2017-03-29 15:13:30,114] FATAL [ReplicaFetcherThread-0-7], Exiting because 
> log truncation is not allowed for partition __consumer_offsets-27, Current 
> leader 7's latest offset 0 is less than replica 15's latest offset 1734972 
> (kafka.server.ReplicaFetcherThread)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
