[ 
https://issues.apache.org/jira/browse/KAFKA-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps updated KAFKA-739:
----------------------------

    Attachment: KAFKA-739-v2.patch

New patch, rebased to trunk, that addresses Neha's comments:

1. Changed delete retention to 24 hours
2. Fixed broken logic in warning statement so it warns when your buffer is too 
big.
3. Yes, that was in the patch, just got lost in the conflict?
4. Dump log segments was printing the value as the key, fixed.
5. SimpleKafkaETLMapper didn't handle null. This isn't an easy fix since the 
text format doesn't have an out-of-range marker to represent null. It now 
returns an empty string, which is ambiguous but better than crashing.
6. Linear probing has the problem that it tends to lead to "runs", i.e. if you 
have a fixed probing step size of N and you hit a collision, then the 
probability that the spot N slots over is also full is going to be higher. So 
the ideal probing approach would be a sequence of fully random hashes that are 
completely uncorrelated with one another. That is the motivation for using the 
rest of the md5 before degrading to linear probing, since we have already 
computed 16 bytes of random hash. The second question is whether it is legit 
to increment byte by byte, since this effectively reuses bytes of the hash. 
I agree it is a little sketchy, though it does seem to work.
7. Clarified the purpose of dump logs.
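
To make point 6 concrete, here is a rough sketch of the probing idea. It is 
not the code from the patch: the class name Md5Probe, the method slotFor, and 
the slot count are hypothetical and chosen only for illustration. Each attempt 
reads a 4-byte int starting one byte further into the 16-byte md5, so 
successive probes share three bytes (the "sketchy" reuse mentioned above). 
Once the last 4-byte window is reached, it falls back to stepping linearly.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; names and layout are assumptions.
class Md5Probe {
    static final int HASH_SIZE = 16; // an md5 digest is 16 bytes

    static byte[] md5(byte[] key) {
        try {
            return MessageDigest.getInstance("MD5").digest(key);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Slot to try on the given 0-based attempt.
    // Attempts 0..12 read an int at byte offset `attempt` of the hash.
    // Later attempts reuse the last window (offset 12) and add a linear step.
    static int slotFor(byte[] hash, int attempt, int slots) {
        int offset = Math.min(attempt, HASH_SIZE - 4);
        int linearStep = Math.max(0, attempt - (HASH_SIZE - 4));
        int probe = ByteBuffer.wrap(hash).getInt(offset) + linearStep;
        return Math.floorMod(probe, slots); // always in [0, slots)
    }

    public static void main(String[] args) {
        byte[] hash = md5("some-key".getBytes(StandardCharsets.UTF_8));
        for (int attempt = 0; attempt < 16; attempt++) {
            System.out.println(attempt + " -> " + slotFor(hash, attempt, 1024));
        }
    }
}
```

The first 13 attempts land on slots that are only loosely related to one 
another, which is what breaks up runs. From attempt 13 onward consecutive 
slots differ by exactly one, i.e. plain linear probing.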
                
> Handle null values in Message payload
> -------------------------------------
>
>                 Key: KAFKA-739
>                 URL: https://issues.apache.org/jira/browse/KAFKA-739
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>             Fix For: 0.8.1
>
>         Attachments: KAFKA-739-v1.patch, KAFKA-739-v2.patch
>
>
> Add tests for null message payloads in producer, server, and consumer.
> Ensure log cleaner treats these as deletes.
> Test that null keys are rejected on dedupe logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
