Hi Daniel,

Your understanding is mostly correct, a few notes to add:

1. fsync is done in a backend thread asynchronously, so event setting
log.flush.interval.messages = 1 will not guarantee that when the producer
request ack back, the messages is definitely on the disk now. But since you
are using RF = 3 and acks = -1, as long as all replicas are still in ISR
the produce response guarantees that the message has been replicated on all
replicas.

2. If your case has a high produce load which may easily cause follower
replicas to drop out of ISR, you may want to tune some replica.lag.*
configs (details can be found below):

http://kafka.apache.org/documentation.html#brokerconfigs

Guozhang


On Tue, Jul 15, 2014 at 6:35 PM, Daniel Compton <d...@danielcompton.net>
wrote:

> I think I know the answer to this already but I wanted to check my
> assumptions before proceeding.
>
> We are using Kafka as a queueing mechanism for receiving messages from
> stateless producers. We are operating in a legal framework where we can
> never lose a committed message, but we can reject a write if Kafka is
> unavailable and it will be retried in the future. We are operating all of
> our servers in one rack so we are vulnerable if a whole rack goes out. We
> will have 3-4 Kafka brokers and have RF=3
>
> To guarantee that we never (to the greatest extent possible) lose a
> message that we have acknowledged, it seems like we need to have
> request.required.acks=-1 and log.flush.interval.messages = 1, i.e. fsync on
> every message and wait for all brokers in ISR to reply before returning
> successfully. This would guard against the failure scenario where all
> servers in our rack go down simultaneously.
>
> Is my understanding correct?
>
> Thanks, Daniel.
>
>


-- 
-- Guozhang

Reply via email to