[ https://issues.apache.org/jira/browse/KAFKA-791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605337#comment-13605337 ]
John Fung commented on KAFKA-791:
---------------------------------

Uploaded kafka-791-v3.patch with additional changes.

The following validation functions:
* validate_data_matched
* validate_simple_consumer_data_matched
* validate_simple_consumer_data_matched_across_replicas
* validate_data_matched_in_multi_topics_from_single_consumer_producer

are modified to share these common behaviors:
* the producer MessageID list is always converted to a set (deduped)
* the consumer MessageID list is left as is
* data loss / mismatch is determined by removing the consumer MessageIDs from the producer MessageID set
* any duplicate in the consumer MessageIDs is treated as a failure
* the data loss failure threshold for Ack=1 test cases is set to 5%
* ordering of MessageIDs is not validated

For the function validate_simple_consumer_data_matched_across_replicas:
* compare each list (no dedupe) of consumer MessageIDs associated with its topic-partition in each replica
* any MessageID mismatch in a given topic-partition between replicas is reported as a failure

> Fix validation bugs in System Test
> ----------------------------------
>
>                 Key: KAFKA-791
>                 URL: https://issues.apache.org/jira/browse/KAFKA-791
>             Project: Kafka
>          Issue Type: Task
>            Reporter: John Fung
>            Assignee: John Fung
>              Labels: replication-testing
>         Attachments: kafka-791-v1.patch, kafka-791-v2.patch, kafka-791-v3.patch
>
>
> The following issues are found in data / log checksum match in System Test:
> 1. kafka_system_test_utils.validate_simple_consumer_data_matched
>    It reports PASSED even when some log segments don't match.
> 2. kafka_system_test_utils.validate_data_matched (this is fixed and patched
>    in local Hudson for some time)
>    It reports PASSED in the Ack=1 cases even when data loss is greater than
>    the tolerance (1%).
> 3. kafka_system_test_utils.validate_simple_consumer_data_matched
>    It gets a unique set of MessageID to validate. It should leave all
>    MessageID as is (no dedup needed) and the test case should fail if the
>    sorted MessageID don't match across the replicas.
> 4. There is a data loss tolerance of 1% in the test cases of Ack=1.
>    Currently 1% is too strict, and some random failures are seen due to
>    2 ~ 3% of data loss. It will be increased to 5% so that the System Test
>    gets a more consistent passing rate in those test cases. The following
>    will be updated to 5% tolerance in kafka_system_test_utils:
>      validate_data_matched
>      validate_simple_consumer_data_matched
>      validate_data_matched_in_multi_topics_from_single_consumer_producer
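
For illustration only, the validation rules described in the comment above could be sketched in Python roughly as follows. This is not the code in kafka-791-v3.patch. The function names, argument shapes and the returned report are hypothetical, and treating a loss of exactly 5% as a pass is an assumption. The replica check compares sorted lists, following item 3 of the issue description.

```python
from collections import Counter

# From the comment: Ack=1 test cases tolerate up to 5% data loss.
ACK1_LOSS_THRESHOLD_PCT = 5.0


def validate_message_ids(producer_ids, consumer_ids, ack_one=False):
    """Hypothetical helper: return (passed, report) for one producer/consumer pair."""
    produced = set(producer_ids)        # producer list is always deduped
    counts = Counter(consumer_ids)      # consumer list is left as is
    # Any MessageID consumed more than once is a failure.
    duplicates = sorted(mid for mid, n in counts.items() if n > 1)
    # Data loss: producer IDs that remain after removing the consumer IDs.
    missing = produced - set(counts)
    loss_pct = 100.0 * len(missing) / len(produced) if produced else 0.0
    # Assumption: non-Ack=1 cases tolerate no loss at all.
    allowed_pct = ACK1_LOSS_THRESHOLD_PCT if ack_one else 0.0
    passed = not duplicates and loss_pct <= allowed_pct
    report = {
        "missing": sorted(missing),
        "duplicates": duplicates,
        "loss_pct": loss_pct,
    }
    return passed, report


def validate_across_replicas(ids_by_replica):
    """Hypothetical helper.

    ids_by_replica maps replica name -> {topic_partition: [message_id, ...]}.
    Lists are compared without deduplication; they are sorted first, so
    ordering differences alone do not count as a mismatch.
    """
    replicas = list(ids_by_replica)
    if not replicas:
        return True, []
    reference = ids_by_replica[replicas[0]]
    all_partitions = set().union(*(r.keys() for r in ids_by_replica.values()))
    mismatched = []
    for tp in sorted(all_partitions):
        expected = sorted(reference.get(tp, []))
        for name in replicas[1:]:
            if sorted(ids_by_replica[name].get(tp, [])) != expected:
                mismatched.append(tp)
                break
    return not mismatched, mismatched
```

With these helpers, a consumer list containing a repeated MessageID fails even when nothing is lost, and an Ack=1 run that loses 4 of 100 messages passes while the same loss fails when ack_one is False.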