Hi everyone,

What is an efficient way to quickly consume, transform, and process 
Kafka messages? Importantly, I am not referring to, nor interested in, Kafka 
Streams, because the topic from which I would like to process the messages will 
eventually stop receiving messages; after that point I need to process the 
messages by extracting certain keys, in a batch-processing-like manner. 

So far I’ve implemented a Kafka consumer group that consumes these messages, 
hashes them according to a certain key, and, upon retrieval of the last message, 
starts the processing script. However, I am dealing with exactly 100,000,000 
log messages of 16 bytes each, which means keeping 1.6 GB of data in memory, 
i.e. on the heap, is not the most efficient approach, performance- and memory-wise.
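For context, the grouping step I described looks roughly like the sketch below. It is a simplified stand-in, not my actual code: `fake_message_stream` plays the role of the consumer poll loop (so it runs without a broker), and the key-extraction rule (first 4 bytes of each 16-byte message) is a placeholder assumption.

```python
from collections import defaultdict

def extract_key(msg: bytes) -> bytes:
    # Placeholder assumption: the first 4 bytes of each 16-byte
    # log message act as the grouping key.
    return msg[:4]

def group_messages(stream):
    """Consume messages until the stream is exhausted, then return
    them grouped (hashed) by key -- mirroring the consumer-group
    setup described above."""
    groups = defaultdict(list)
    for msg in stream:          # in reality: a Kafka poll loop
        groups[extract_key(msg)].append(msg)
    return groups               # everything stays on the heap

# Simulated stand-in for the (eventually drained) Kafka topic.
def fake_message_stream(n):
    for i in range(n):
        key = (i % 3).to_bytes(4, "big")
        yield key + i.to_bytes(12, "big")

groups = group_messages(fake_message_stream(9))
print(len(groups))                           # 3 distinct keys
print(len(groups[(0).to_bytes(4, "big")]))   # 3 messages under key 0
```

As the last comment notes, all messages (in my case ~1.6 GB) accumulate on the heap before processing starts, which is exactly the part I’d like to avoid.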

Regards,
Dominik
 
 
