[
https://issues.apache.org/jira/browse/KAFKA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872476#comment-16872476
]
Sönke Liebau commented on KAFKA-5115:
-------------------------------------
Hi [~MiniMizer],
we've just discussed this today, and while the change itself would be fairly
simple, I believe there are a lot of areas that would need investigation and
testing before this could be recommended for a production deployment.
Specifically, everything around transactions and idempotent producers seems to
me to be worth a dedicated look.
On the consumer side, the immediate concern I think is offsets: stored offsets
might not create issues (but may also not work), but anything cached inside
the Fetcher could cause havoc.
Bottom line: it is a good idea that I'd fully support, but probably needs more
work than is immediately apparent.
> Use bootstrap.servers to refresh metadata
> -----------------------------------------
>
> Key: KAFKA-5115
> URL: https://issues.apache.org/jira/browse/KAFKA-5115
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.10.2.0
> Reporter: Dan
> Priority: Major
>
> Currently, it seems that the bootstrap.servers list is used only when the
> producer starts, to discover the cluster; subsequent metadata refreshes
> go directly to the discovered brokers.
> We would like to use the bootstrap.servers list for metadata refresh to
> support a failover mechanism by providing a VIP which can dynamically
> redirect requests to a secondary Kafka cluster if the primary is down.
> Consider the following use case, where "kafka-cluster.local" is a VIP on a
> load balancer with priority server pools that point to two different Kafka
> clusters (so when all servers of cluster #1 are down, it automatically
> redirects to servers from cluster #2).
> bootstrap.servers: kafka-cluster.local:9092
> 1) Producer starts, connects to kafka-cluster.local and discovers all servers
> from cluster #1
> 2) Producer starts producing to cluster #1
> 3) cluster #1 goes down
> 4) Producer detects the failure, refreshes metadata from kafka-cluster.local
> (which now returns nodes from cluster #2)
> 5) Producer starts producing to cluster #2
> 6) cluster #1 is brought back online, and kafka-cluster.local now points to
> it again
> In the current state, it seems that the producer will never revert to cluster
> #1 because it continues to refresh its metadata from the brokers of cluster
> #2, even though kafka-cluster.local no longer points to that cluster.
> If we could force the metadata refresh to happen against
> "kafka-cluster.local", it would enable automatic failover and failback
> between the clusters.
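The scenario above only requires a standard producer configuration whose
bootstrap.servers points at the VIP rather than at individual brokers; the
proposed change is in the client's refresh behavior, not in the config. A
minimal sketch of that configuration, using the kafka-cluster.local VIP from
the report (the class name and helper method here are illustrative, not part
of any Kafka API):

```java
import java.util.Properties;

public class VipBootstrapConfig {

    // Builds a producer config that bootstraps through the load-balancer VIP,
    // so initial discovery always hits whichever cluster the VIP currently
    // resolves to. Under the current client behavior this only matters at
    // startup; the proposal is to also use it for later metadata refreshes.
    public static Properties producerConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-cluster.local:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerConfig().getProperty("bootstrap.servers"));
    }
}
```

With this config, step 4 of the scenario would work only if the producer went
back to the bootstrap list on refresh instead of asking the (dead) brokers of
cluster #1.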
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)