[ https://issues.apache.org/jira/browse/KAFKA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249255#comment-14249255 ]
Jay Kreps commented on KAFKA-1788:
----------------------------------
Currently the producer supports either blocking or dropping when it cannot send
to the cluster as fast as data is arriving. This could occur because the
cluster is down, or just because it isn't fast enough to keep up.
Kafka provides high availability for partitions so the case where a partition
is permanently unavailable should be rare.
Timing out requests might be nice, but it's not 100% clear that it is better
than the current strategy, which is just to buffer as long as possible and
then either block or drop data when the buffer is exhausted. Arguably,
dropping when you are out of space is better than dropping after a fixed time
(since in any case you have to drop when you are out of space).
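To make the block-or-drop behavior concrete, here is a minimal sketch of a
bounded buffer that either blocks the caller or drops the record once it is
out of space. This is a toy model, not the producer's RecordAccumulator; the
class BoundedSendBuffer, its methods, and the Result enum are all hypothetical
names made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy bounded buffer (hypothetical; not Kafka's RecordAccumulator). */
final class BoundedSendBuffer {
    enum Result { BUFFERED, DROPPED }

    private final Deque<byte[]> records = new ArrayDeque<>();
    private final long capacityBytes;
    private final boolean blockWhenFull;
    private long usedBytes = 0;

    BoundedSendBuffer(long capacityBytes, boolean blockWhenFull) {
        this.capacityBytes = capacityBytes;
        this.blockWhenFull = blockWhenFull;
    }

    /** Buffers the record if it fits; otherwise blocks or drops, per configuration. */
    synchronized Result append(byte[] record) {
        if (record.length > capacityBytes) {
            return Result.DROPPED; // can never fit, so waiting would not help
        }
        while (usedBytes + record.length > capacityBytes) {
            if (!blockWhenFull) {
                return Result.DROPPED; // out of space: drop instead of waiting
            }
            try {
                wait(); // out of space: block until drainOne() frees room
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Result.DROPPED; // give up if the caller is interrupted
            }
        }
        records.addLast(record);
        usedBytes += record.length;
        return Result.BUFFERED;
    }

    /** Removes the oldest record (as a sender would) and wakes blocked callers. */
    synchronized byte[] drainOne() {
        byte[] record = records.pollFirst();
        if (record != null) {
            usedBytes -= record.length;
            notifyAll();
        }
        return record;
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

Note that nothing in this sketch expires a record by age: data is only lost
when space runs out, which is the trade-off discussed above.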
As Ewen says, we can't reset the metadata because the bootstrap servers may no
longer exist and, if they do, they are by definition a subset of the current
cluster metadata. I think Ewen's solution of just making sure leastLoadedNode
eventually tries all nodes is the right way to go. We'll have to be careful,
though, as that method is pretty constrained.
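One way to picture "eventually tries all nodes" is a selector that still
prefers the node with the fewest in-flight requests but rotates its starting
point on every call, so ties do not always resolve to the same node. This is
only a sketch under assumptions, not the actual leastLoadedNode code: the
class RotatingNodeSelector, the pick method, and the array-based inputs are
hypothetical, and the real method has constraints this ignores.

```java
/** Hypothetical selector; not the producer's actual leastLoadedNode logic. */
final class RotatingNodeSelector {
    private int nextOffset = 0;

    /**
     * Returns the index of the usable node with the fewest in-flight requests,
     * or -1 if no node is usable. The scan starts at a rotating offset, so
     * repeated calls with equal load cycle through every usable node.
     *
     * @param inFlight   in-flight request count per node
     * @param blackedOut true for nodes that should not be tried right now
     */
    int pick(int[] inFlight, boolean[] blackedOut) {
        int n = inFlight.length;
        int best = -1;
        for (int i = 0; i < n; i++) {
            int idx = (nextOffset + i) % n;
            if (blackedOut[idx]) {
                continue; // skip nodes we cannot currently use
            }
            // Strict comparison keeps the first node seen on ties,
            // and the first node seen changes with nextOffset.
            if (best == -1 || inFlight[idx] < inFlight[best]) {
                best = idx;
            }
        }
        if (n > 0) {
            nextOffset = (nextOffset + 1) % n; // rotate the start for the next call
        }
        return best;
    }
}
```

With three equally loaded nodes, successive calls would return indexes 0, 1,
and 2 in turn, so a node that has become reachable again is eventually tried.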
> producer record can stay in RecordAccumulator forever if leader is no
> available
> -------------------------------------------------------------------------------
>
> Key: KAFKA-1788
> URL: https://issues.apache.org/jira/browse/KAFKA-1788
> Project: Kafka
> Issue Type: Bug
> Components: core, producer
> Affects Versions: 0.8.2
> Reporter: Jun Rao
> Assignee: Jun Rao
> Labels: newbie++
> Fix For: 0.8.3
>
>
> In the new producer, when a partition has no leader for a long time (e.g.,
> all replicas are down), the records for that partition will stay in the
> RecordAccumulator until the leader is available. This may cause the
> bufferpool to be full and the callback for the produced message to block for
> a long time.