[ https://issues.apache.org/jira/browse/KAFKA-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630835#comment-13630835 ]

Jay Kreps commented on KAFKA-156:
---------------------------------

Scott--I am interested in implementing a deduplication scheme similar to what 
you propose. I think this would have several uses. This would definitely be 
post 0.8.

I do think we are conflating the storage location (disk versus memory) with 
the deduplication/commit mechanism. I claim the scheme you propose just avoids 
duplicates and is unrelated to writing data to disk.

I have to say I am a little skeptical of the "fall back to disk thing". In our 
usage we have many thousands of servers and a small number of kafka servers 
with nice disks--I think this is fairly standard. The idea that involving 
thousands of crappy local disks in the data pipeline will decrease the 
empirical frequency of data loss seems dubious to me.

But regardless of whether you buffer on disk or in memory (as the client 
currently does), as long as the client has sufficient space to buffer until 
the server is available again, there is no data loss. And indeed the 
replication fail-over is very fast, so this really does work. As you point 
out, though, that does lead to the possibility of duplicate messages, which 
is where your proposal comes in.
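
To make the duplicate window concrete, here is a minimal sketch of the 
client-side retry loop (hypothetical names, not the actual producer code). If 
the broker appends the batch but the ack is lost, the resend appends it again:

    import java.io.IOException;

    // Hypothetical sketch of the client-side retry loop, not the actual producer.
    public class RetryingSender {
        interface Broker { void send(byte[] batch) throws IOException; }

        private final Broker broker;
        private final long backoffMs;

        RetryingSender(Broker broker, long backoffMs) {
            this.broker = broker;
            this.backoffMs = backoffMs;
        }

        // Retries until an ack is received. If the broker appended the batch but
        // the ack was lost in transit, the resend appends it a second time --
        // that is the duplicate-message window discussed above.
        void sendWithRetry(byte[] batch) throws InterruptedException {
            while (true) {
                try {
                    broker.send(batch); // returning normally == ack received
                    return;
                } catch (IOException e) {        // timeout or connection loss:
                    Thread.sleep(backoffMs);     // ack status unknown, so retry
                }
            }
        }
    }

Retrying is the only safe choice for durability, which is why the duplicates 
have to be handled on the broker side rather than by giving up on the resend.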

I had thought of a similar thing. Here was my idea:
1. Client provides a unique instance id for itself.
2. Each message contains the instance id and a per-client sequence number.
3. Broker maintains a per-client highwater mark on the sequence number, 
periodically checkpointed to disk.
4. In the event of a hard crash the broker rebuilds the highwater marks from 
the last checkpoint and the log.
5. Broker discards any request containing a message from a client that has a 
sequence number less than or equal to the highwater mark.

The advantage of this approach would be that it doesn't require a multi-phase 
produce; the disadvantage is that it requires assigning client ids.
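
For concreteness, here is a minimal sketch of the broker-side bookkeeping for 
steps 2-5 (hypothetical names; assumes appends for a partition are handled on 
a single thread, so no locking is shown):

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch of the per-client highwater-mark bookkeeping (steps 2-5).
    // Assumes appends for a partition are handled on a single thread.
    public class ClientSequenceDeduplicator {
        // Step 3: per-client highwater mark, periodically checkpointed to disk.
        private final Map<String, Long> highwaterMarks = new HashMap<>();

        // Step 5: discard anything at or below the client's highwater mark.
        public boolean shouldAppend(String clientInstanceId, long sequenceNumber) {
            Long hw = highwaterMarks.get(clientInstanceId);
            if (hw != null && sequenceNumber <= hw) {
                return false; // duplicate of a message already in the log
            }
            highwaterMarks.put(clientInstanceId, sequenceNumber);
            return true;
        }

        // Step 4: after a hard crash, restore the last checkpoint and replay the
        // log suffix written since that checkpoint to rebuild the marks.
        public void rebuild(Map<String, Long> checkpoint,
                            Iterable<LoggedMessage> logSinceCheckpoint) {
            highwaterMarks.clear();
            highwaterMarks.putAll(checkpoint);
            for (LoggedMessage m : logSinceCheckpoint) {
                highwaterMarks.merge(m.clientInstanceId, m.sequenceNumber, Math::max);
            }
        }

        // Minimal stand-in for a message read back from the log (step-2 fields).
        public static final class LoggedMessage {
            final String clientInstanceId;
            final long sequenceNumber;
            public LoggedMessage(String clientInstanceId, long sequenceNumber) {
                this.clientInstanceId = clientInstanceId;
                this.sequenceNumber = sequenceNumber;
            }
        }
    }

Checkpointing would just be a periodic write of the map to disk, and recovery 
only has to replay the log written since the last checkpoint.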

One question about your proposal. Let's say the broker fails before sending 
the "Acknowledge Batch Commit" and ownership of that partition fails over to 
another broker. The client doesn't know whether the transaction was committed 
(and the broker died just before sending the ack) or was not committed. How 
can the producer then send to the other broker, which won't have the same 
UUID info?
                
> Messages should not be dropped when brokers are unavailable
> -----------------------------------------------------------
>
>                 Key: KAFKA-156
>                 URL: https://issues.apache.org/jira/browse/KAFKA-156
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Sharad Agarwal
>             Fix For: 0.8
>
>
> When no brokers are available, the producer should spool messages to disk 
> and keep retrying until the brokers come back.
> This will also enable broker upgrades/maintenance without message loss.
