[ https://issues.apache.org/jira/browse/KAFKA-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jay Kreps updated KAFKA-739:
----------------------------

    Attachment: KAFKA-739-v2.patch

New patch rebased to trunk and addresses Neha's comments:
1. Changed delete retention to 24 hours.
2. Fixed broken logic in the warning statement so it warns when your buffer is too big.
3. Yes, that was in the patch, just got lost in the conflict?
4. Dump log segments was printing the value as the key, fixed.
5. SimpleKafkaETLMapper didn't handle null. This isn't an easy fix since the text format doesn't have an out-of-range marker to represent null. Returning an empty string, which is ambiguous but better than crashing.
6. Linear probing has the problem that it tends to lead to "runs". I.e. if you have a fixed probing step size of N, then when you have a collision the probability that the spot N slots over is also full is going to be higher. So the ideal probing approach would be a sequence of fully random hashes which are completely uncorrelated with one another. That is the motivation for using the rest of the md5 before degrading to linear probing, since we have already computed 16 bytes of random hash. The second question is whether it is legit to increment byte by byte or not, since this effectively reuses bytes of the hash. I agree it is a little sketchy, though it does seem to work.
7. Clarified the purpose of dump logs.

> Handle null values in Message payload
> -------------------------------------
>
>                 Key: KAFKA-739
>                 URL: https://issues.apache.org/jira/browse/KAFKA-739
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>             Fix For: 0.8.1
>
>         Attachments: KAFKA-739-v1.patch, KAFKA-739-v2.patch
>
>
> Add tests for null message payloads in producer, server, and consumer.
> Ensure log cleaner treats these as deletes.
> Test that null keys are rejected on dedupe logs.

--
This message is automatically generated by JIRA.
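
For readers following point 6 of the comment, here is a rough sketch of the probing idea it describes: use successive pieces of the already computed md5 as probe positions, and only fall back to linear probing once the 16 bytes are used up. This is not the code from the patch. The names `probe_slot` and `probe_sequence`, the 4-byte window, and the slot count are all assumptions made for illustration.

```python
import hashlib

HASH_SIZE = 16   # an MD5 digest is 16 bytes
WINDOW = 4       # assumed: each probe reads a 4-byte integer from the digest


def probe_slot(digest: bytes, attempt: int, slots: int) -> int:
    """Return the slot to try on the given 0-based attempt (illustrative only)."""
    # Slide the window over the digest one byte at a time. Neighbouring
    # windows overlap, so bytes of the hash are reused between probes.
    offset = min(attempt, HASH_SIZE - WINDOW)
    window = int.from_bytes(digest[offset:offset + WINDOW], "big")
    # Once the window has reached the end of the digest, degrade to
    # linear probing by adding 1, 2, 3, ... to the last window value.
    linear_step = max(0, attempt - (HASH_SIZE - WINDOW))
    return (window + linear_step) % slots


def probe_sequence(key: bytes, slots: int, attempts: int) -> list:
    """List the slots that would be tried for a key, in order."""
    digest = hashlib.md5(key).digest()
    return [probe_slot(digest, attempt, slots) for attempt in range(attempts)]
```

With these assumptions, `probe_sequence(b"some-key", 1024, 16)` gives 13 digest-derived slots (window offsets 0 through 12) followed by 3 slots that each advance by one. The overlapping windows are the "byte by byte" reuse the comment calls a little sketchy: consecutive probes share three of their four bytes, so they are not fully uncorrelated.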