Hi Xiang,

> 1. Which version(s) is the most proper version to go to if I want to get
> the core design and implementation of Kafka so that it won’t be too old and
> not too new either.

The code in the trunk branch is the best choice if you want to understand
Kafka's most up-to-date architecture :)

> 2. If I am trying to read source code, which subprojects(folders) should I
> cover so that they make up the backbone of Kafka and in what order.

My suggestion is to start by reading the code of the components you use
most often. For example, the producer is often a good starting point. We've
done a lot of optimization for the producer, such as memory pooling,
batching, and the built-in partitioner. The producer also contains many
useful and interesting mechanisms, like idempotence and transactions.

> 3. Is there any other material that I can reference to before and during
> jumping into source code ?

My answer is the same as Colin's:
https://www.confluent.io/resources/ebook/kafka-the-definitive-guide/

> 4. Last but not least, is there any Kafka committer or contributor who can
> share their get-started experience as all the email threads seem organized
> yet a little overwhelmed?

My suggestion is simple: read, read, and read. Update outdated docs by
reading the official documentation (https://kafka.apache.org/documentation/).
Fix potential bugs by reading the source code. Improve performance by
reading the metrics. Collaborate on patches by reading other contributors'
PRs. A lot of reading helped me make numerous contributions to Kafka, even
though I wasn't an actual Kafka user.

> 0. Is reading source code the best way to learn more about Kafka ?

Yes, definitely.

By the way, welcome to the Kafka community! :)

Best,
Chia-Ping