Jason Gustafson created KAFKA-9835:
--------------------------------------

Summary: Race condition with concurrent write allows reads above high watermark
Key: KAFKA-9835
URL: https://issues.apache.org/jira/browse/KAFKA-9835
Project: Kafka
Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Jason Gustafson

Kafka's log implementation serializes all writes using a lock, but allows multiple concurrent reads while that lock is held. The `FileRecords` class contains the core implementation. Reads from the log create logical slices of `FileRecords`, which are then passed to the network layer for sending. An abridged version of the implementation of `slice` is provided below:

{code}
public FileRecords slice(int position, int size) throws IOException {
    int end = this.start + position + size;
    // handle integer overflow or if end is beyond the end of the file
    if (end < 0 || end >= start + sizeInBytes())
        end = start + sizeInBytes();
    return new FileRecords(file, channel, this.start + position, end, true);
}
{code}

The `size` parameter here is typically derived from the fetch size, but is capped so that the slice does not extend past the high watermark.

The two calls to `sizeInBytes` here are problematic because the size of the file may change between them. Specifically, a concurrent write may increase the size of the file after the first call to `sizeInBytes` but before the second one. If `end` already equals the end of the file at the first call, the comparison succeeds and `end` is then reassigned from the second, larger value. In the worst case, when `size` defines the limit of the high watermark, this can lead to a slice containing uncommitted data.
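
To make the interleaving concrete, here is a minimal standalone sketch of the race and of one possible way to avoid it, namely reading the size once into a local variable. It is not the actual patch. The class `SliceEndSketch`, the methods `racyEnd` and `snapshotEnd`, and the helper `growingSize` are hypothetical names introduced for illustration, and the file size is modeled as an `IntSupplier` rather than a real `FileRecords`.

```java
import java.util.function.IntSupplier;

public class SliceEndSketch {

    // Mirrors the logic quoted above: the size is read twice, so the two
    // reads may observe different values if a write lands in between.
    public static int racyEnd(int start, int position, int size, IntSupplier sizeInBytes) {
        int end = start + position + size;
        if (end < 0 || end >= start + sizeInBytes.getAsInt())
            end = start + sizeInBytes.getAsInt();
        return end;
    }

    // Hypothetical alternative: read the size once and use that snapshot
    // for both the comparison and the assignment.
    public static int snapshotEnd(int start, int position, int size, IntSupplier sizeInBytes) {
        int currentSizeInBytes = sizeInBytes.getAsInt();
        int end = start + position + size;
        if (end < 0 || end > start + currentSizeInBytes)
            end = start + currentSizeInBytes;
        return end;
    }

    // Simulates a concurrent append deterministically: the first read of
    // the size returns 100, every later read returns 150.
    public static IntSupplier growingSize() {
        int[] calls = {0};
        return () -> calls[0]++ == 0 ? 100 : 150;
    }

    public static void main(String[] args) {
        // Assume the high watermark is at byte 100 and the fetch is capped there.
        System.out.println(racyEnd(0, 0, 100, growingSize()));     // prints 150
        System.out.println(snapshotEnd(0, 0, 100, growingSize())); // prints 100
    }
}
```

With the racy version the slice ends at byte 150, which is 50 bytes past the assumed high watermark. With the snapshot version the slice ends at byte 100 no matter when the append happens.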