[ https://issues.apache.org/jira/browse/KAFKA-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Björn Eriksson updated KAFKA-5546:
----------------------------------
    Affects Version/s: 0.11.0.0
          Environment: docker, failing-network
          Description:
We've noticed that if the leader's networking is deconfigured (with {{ifconfig eth0 down}}), the producer doesn't notice and doesn't immediately connect to the newly elected leader.

{{docker-compose.yml}} and test runner are at https://github.com/owbear/kafka-network-failure-tests.

We were expecting a transparent failover to the new leader, but testing shows a gap of 8-15 seconds after the network is taken down during which no values are stored in the log.

Tests (and results) [against 0.10.2.1|https://github.com/owbear/kafka-network-failure-tests/tree/kafka-network-failure-tests-0.10.2.1]

Tests (and results) [against 0.11.0.0|https://github.com/owbear/kafka-network-failure-tests/tree/kafka-network-failure-tests-0.11.0.0]

  was:
We've noticed that if the leader's networking is deconfigured (with {{ifconfig eth0 down}}), the producer doesn't notice and doesn't immediately connect to the newly elected leader.

{{docker-compose.yml}} and test runner are at https://github.com/owbear/kafka-network-failure-tests with sample test output at https://github.com/owbear/kafka-network-failure-tests/blob/master/README.md#sample-results

I was expecting a transparent failover to the new leader.

The attached log shows that while the producer produced values between {{12:37:33}} and {{12:37:54}}, there's a gap between {{12:37:41}} and {{12:37:50}} where no values were stored in the log after the network was taken down at {{12:37:42}}.

              Summary: Temporary loss of availability data when the leader is disconnected  (was: Lost data when the leader is disconnected.)
> Temporary loss of availability data when the leader is disconnected
> -------------------------------------------------------------------
>
>                 Key: KAFKA-5546
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5546
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>    Affects Versions: 0.10.2.1, 0.11.0.0
>         Environment: docker, failing-network
>            Reporter: Björn Eriksson
>
> We've noticed that if the leader's networking is deconfigured (with
> {{ifconfig eth0 down}}), the producer doesn't notice and doesn't
> immediately connect to the newly elected leader.
> {{docker-compose.yml}} and test runner are at
> https://github.com/owbear/kafka-network-failure-tests.
> We were expecting a transparent failover to the new leader, but testing
> shows a gap of 8-15 seconds after the network is taken down during which
> no values are stored in the log.
> Tests (and results) [against 0.10.2.1|https://github.com/owbear/kafka-network-failure-tests/tree/kafka-network-failure-tests-0.10.2.1]
> Tests (and results) [against 0.11.0.0|https://github.com/owbear/kafka-network-failure-tests/tree/kafka-network-failure-tests-0.11.0.0]
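
For readers who want to measure the gap described above in their own test output, here is a minimal sketch of one way to find the longest pause between consecutive stored values. It is not taken from the linked test runner. The function name {{largest_gap}}, the {{HH:MM:SS}} timestamp format and the sample list are assumptions made for illustration. Only the 12:37:41 to 12:37:50 pause mirrors the gap reported in the original description; the surrounding values are made up.

```python
from datetime import datetime, timedelta


def largest_gap(timestamps, fmt="%H:%M:%S"):
    """Return (start, end, gap) for the longest pause between consecutive timestamps.

    Hypothetical helper: assumes all timestamps fall on the same day and
    use the given strptime format.
    """
    times = sorted(datetime.strptime(t, fmt) for t in timestamps)
    if len(times) < 2:
        raise ValueError("need at least two timestamps")
    # Pair each timestamp with its successor and keep the widest pair.
    start, end = max(zip(times, times[1:]), key=lambda pair: pair[1] - pair[0])
    return start, end, end - start


# Illustrative sample only: times at which values were stored in the log.
stored = ["12:37:39", "12:37:40", "12:37:41", "12:37:50", "12:37:51"]

start, end, gap = largest_gap(stored)
print(f"longest gap: {start:%H:%M:%S} -> {end:%H:%M:%S} ({gap.total_seconds():.0f}s)")
```

Running the same function over the timestamps of values actually read back from the topic gives a single number to compare between the 0.10.2.1 and 0.11.0.0 runs.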