Some more detail on this issue:
On a hunch I tried restarting my docker-compose stack a few more times.
Still the same problem: my application, using the Kafka Client APIs,
claims it is talking to Kafka, but the Kafka logs disagree.
So I restarted the stack once more. With 'docker-compose up' this is a
very clean start. Then I waited about 10 minutes until I saw
kafka-1_1 | [2017-09-13 15:00:47,214] INFO [Group Metadata Manager on
Broker 1001]: Removed 0 expired offsets in 0 milliseconds.
(kafka.coordinator.GroupMetadataManager)
kafka-3_1 | [2017-09-13 15:00:47,595] INFO [Group Metadata Manager on
Broker 1002]: Removed 0 expired offsets in 0 milliseconds.
(kafka.coordinator.GroupMetadataManager)
kafka-2_1 | [2017-09-13 15:00:47,771] INFO [Group Metadata Manager on
Broker 1003]: Removed 0 expired offsets in 0 milliseconds.
(kafka.coordinator.GroupMetadataManager)
in the log. When I reran my application, the Kafka logs suddenly came
alive with indications that topics were being created, and so on.
I have been using Kafka for a couple of months now, and this is new
behavior I had not seen until about a week ago. I am used to being
able to run my application immediately after the Kafka stack comes up
in Docker. Operationally, it now seems I have to wait about 10 minutes
after starting Kafka.
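As a workaround, I am considering having my application poll the cluster
with the 0.11.0.0 AdminClient until all 3 brokers are visible, rather than
sleeping blindly. A rough sketch of what I have in mind (the bootstrap
address and broker count here are placeholders for my actual setup):

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

import java.util.Properties;

public class WaitForBrokers {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address; substitute the real advertised listeners.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-1:9092");

        AdminClient admin = AdminClient.create(props);
        try {
            // Poll until all 3 brokers have registered, instead of waiting 10 minutes.
            while (admin.describeCluster().nodes().get().size() < 3) {
                System.out.println("Cluster not fully up yet, waiting 5 seconds...");
                Thread.sleep(5000);
            }
            System.out.println("All brokers visible; starting the application.");
        } finally {
            admin.close();
        }
    }
}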
Of course I am still dealing with the NotLeaderForPartitionException
problem, which is also new, and breaks my application, but at least I
seem to have a repeatable path to that problem.
Cheers, Eric
On 2017-09-12 2:43 PM, Eric Kolotyluk wrote:
For the last few days I have been seeing a problem I do not know how
to explain.
For months I have been successfully running Kafka/Zookeeper under
Docker, and my application seems to work fine. Lately, when I run
Kafka under either docker-compose on my development system or 'docker
stack deploy' on a Docker Swarm on AWS, here is what I am seeing:
According to the logs, Zookeeper/Kafka seem to start okay, and the 3
brokers I have configured seem to find each other. The logs look
pretty normal. Then I start my application, and my application logs
show that it has connected to the Kafka cluster and created the
topics okay. However, there is nothing in the Kafka logs to show any
kind of connection from my application, let alone topics being
created. Sure enough, when I rerun my application, it cannot find the
topics, so it tries to create them again and again gets a successful
response from the Kafka Admin Client. Nope, they were not created.
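For reference, this is roughly the shape of topic creation I would expect
to work with the 0.11.0.0 AdminClient (a simplified sketch; the topic
name, partition/replica counts, and bootstrap address are placeholders),
including a listTopics cross-check:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Properties;
import java.util.Set;

public class CreateAndVerifyTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-1:9092");

        AdminClient admin = AdminClient.create(props);
        try {
            // Block on the result; get() should throw if the broker rejected the request.
            NewTopic topic = new NewTopic("my-topic", 3, (short) 3);
            admin.createTopics(Collections.singleton(topic)).all().get();

            // Cross-check by asking the cluster which topics it actually knows about.
            Set<String> names = admin.listTopics().names().get();
            System.out.println("Broker reports topics: " + names);
        } finally {
            admin.close();
        }
    }
}

Blocking on all().get() is supposed to surface any broker-side failure as
an exception rather than silently succeeding.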
When I shut down Kafka, the logs show the shutdown sequence for all
the brokers and Zookeeper. I cannot understand why the Kafka client
library shows no errors when the Kafka logs show no connections or
operations at all.
I tried both Kafka 0.11.0.0 and 0.10.2.1 -- same problem.
I have been trying to figure out this problem all morning, bashing my
head against the wall.
*Then I go to lunch*, and a couple of hours later I try one more time.
Behold, suddenly I can see the Kafka logs reporting they have created
the topics my application requested. But now I am stuck with the
infamous org.apache.kafka.common.errors.NotLeaderForPartitionException
problem again. This is another new problem that has started recently.
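For what it is worth, my understanding is that
NotLeaderForPartitionException is a retriable error, so I am planning to
give the producer more room to retry while it refreshes metadata, along
these lines (a sketch; the retry values and bootstrap address are just
guesses, not tuned settings):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class RetryingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder bootstrap address.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // NotLeaderForPartitionException is retriable, so allow the producer
        // to refresh metadata and retry against the newly elected leader.
        props.put(ProducerConfig.RETRIES_CONFIG, 10);
        props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 1000);

        Producer<String, String> producer = new KafkaProducer<>(props);
        // ... send records as usual ...
        producer.close();
    }
}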
Unfortunately, I have wasted hours and hours fighting the first
problem, so I have not been able to dig into this one.
What could possibly be the explanation for this not working, and then
working again after a few hours?
It seems insanely difficult to operate a Kafka cluster in any kind of
stable configuration that does not fail randomly.
Can anyone offer any kind of advice on what the problem might be?
Is it better to just give up trying to operate our own Kafka cluster
and use Kinesis instead?
Cheers, Eric