Hi Vitor, I'm not an expert and probably some more knowledgeable folks can also chime in (and correct me) but a few things came to mind:

1) On the write side (i.e. when using the producer), Kafka does not
force data to disk (fsync) on every write by default. It writes to the
OS page cache, so writes are effectively in-memory at first: they are
staged in the page cache and the kernel flushes them to disk
asynchronously. Also, the API contract for Kafka is quite "simple" in
that it mostly reads and writes arbitrary sequences of bytes. There
isn't as much complex transactional machinery in front of the
reads/writes that might hurt performance, compared to some other data
stores. Note that Kafka does provide things like idempotence and
transactions, so it's not like there is never any overhead to consider.

2) Kafka's reads and writes are mostly sequential, since each partition
is an append-only log, and that helps a lot with performance. Random
disk I/O is a lot slower than sequential I/O.

3) On the read side (i.e. when using the consumer), Kafka uses a
zero-copy technique (sendfile) in which data is sent directly from the
page cache to the network socket without being copied through user
space, which helps a lot.

4) Kafka batches aggressively. Messages are grouped into batches on the
producer side and are stored and fetched in batches, which amortizes
network and disk overhead over many messages.

Here are two resources which might provide more information:
https://docs.confluent.io/platform/current/kafka/design.html
https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

Hope this helps a bit.

Andrew

On Thu, Oct 14, 2021 at 1:11 PM Vitor Augusto de Medeiros
<v.medei...@aluno.ufabc.edu.br> wrote:

> Hi everyone,
>
> i'm doing a benchmark comparison between Kafka and Redis for my final
> bachelor paper and would like to understand more about why Kafka have
> higher throughput if compared to Redis.
>
> I noticed Redis has lower overall latency (and makes sense since it's
> stored in memory) but cant figure out the difference in throughput.
>
> I found a study (not sure if i can post links here but it's named A
> COMPARISON OF DATA INGESTION PLATFORMS IN REAL-TIME STREAM PROCESSING
> PIPELINES by Sebastian Tallberg)
> showing Kafka's throughput hitting 3x the amount of msg/s if compared to
> Redis for a 1kB payload. I would like to understand what is in Kafka's
> architecture that allows it to be a lot faster than other message
> brokers/Redis in particular
>
> Thanks!
>

--
Andrew Grant
8054482621
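
P.S. To make points 1 to 4 a bit more concrete, here is a toy Python
sketch. It is not Kafka's actual code: the function names (append_batch,
serve_log, read_records, demo) are made up for illustration, and the
4-byte length-prefixed record layout is an assumption of mine, not
Kafka's on-disk format. It appends whole batches sequentially to a file
without calling fsync (so the data sits in the page cache first), then
serves the file over a socket with socket.sendfile, which uses the
sendfile system call where the OS supports it and falls back to plain
send() otherwise.

```python
import os
import socket
import tempfile


def append_batch(log_path, messages):
    """Append a batch of byte messages to the end of the log in one write."""
    # Hypothetical record layout: 4-byte big-endian length, then the payload.
    payload = b"".join(len(m).to_bytes(4, "big") + m for m in messages)
    # Append mode keeps writes sequential. There is deliberately no fsync:
    # the OS keeps the data in the page cache and flushes it to disk later.
    with open(log_path, "ab") as log:
        log.write(payload)
    return len(payload)


def serve_log(log_path, conn, offset=0):
    """Send the log from `offset` onward to a connected stream socket."""
    with open(log_path, "rb") as log:
        # socket.sendfile uses os.sendfile when available, so the bytes can
        # go from the page cache to the socket without a user-space copy.
        return conn.sendfile(log, offset=offset)


def read_records(data):
    """Split a length-prefixed byte string back into individual messages."""
    records, pos = [], 0
    while pos < len(data):
        size = int.from_bytes(data[pos:pos + 4], "big")
        records.append(data[pos + 4:pos + 4 + size])
        pos += 4 + size
    return records


def demo():
    """Write two batches, ship the log over a local socket pair, decode it."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        written = append_batch(path, [b"a", b"bb"])
        written += append_batch(path, [b"ccc"])
        reader, writer = socket.socketpair()
        with reader, writer:
            sent = serve_log(path, writer)
            received = b""
            while len(received) < sent:
                received += reader.recv(4096)
        return written, sent, read_records(received)
    finally:
        os.remove(path)
```

Calling demo() returns the number of bytes written, the number of bytes
sent, and the decoded messages. The point is only that the "broker" side
never parses or copies individual messages when serving a reader; it
hands a byte range of the file to the kernel.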