I am new to Kafka, so please excuse me if this is a very basic question. I have a cluster set up with 3 ZooKeeper nodes and 9 brokers. Network security logs flow into the Kafka cluster, and I am using Logstash to read them from the cluster and ingest them into an Elasticsearch cluster.
My current settings are mostly default. I created a topic with 8 partitions, and I have 4 Logstash consumers reading that topic and feeding my ES cluster. My problem is that I can't keep up with real time: I am constantly falling behind, and logs are building up on my Kafka cluster. When I run:

$ /opt/kafka/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group logstash --zookeeper localhost:2181 --topic bro-logs

I get the following:

Group     Topic     Pid  Offset    logSize   Lag      Owner
logstash  bro-logs  0    25937394  29935485  3998091  logstash_OP-01-VM-553-1457301346564-d14fd84a-0
logstash  bro-logs  1    25929594  29935506  4005912  logstash_OP-01-VM-553-1457301346564-d14fd84a-0
logstash  bro-logs  2    26710728  29935519  3224791  logstash_OP-01-VM-554-1457356976268-fa8c24b9-0
logstash  bro-logs  3    3887940   6372075   2484135  logstash_OP-01-VM-554-1457356976268-fa8c24b9-0
logstash  bro-logs  4    3978342   6372074   2393732  logstash_OP-01-VM-555-1457368235387-c6b8bd1f-0
logstash  bro-logs  5    3984965   6372075   2387110  logstash_OP-01-VM-555-1457368235387-c6b8bd1f-0
logstash  bro-logs  6    4017715   6372076   2354361  logstash_OP-01-VM-556-1457368464998-8edb13df-0
logstash  bro-logs  7    4022484   6372074   2349590  logstash_OP-01-VM-556-1457368464998-8edb13df-0

From what I understand, the Lag column is telling me that there is a whole bunch of logs waiting in the cluster to be processed. So my questions are: should I spin up more Logstash consumers to read from the Kafka cluster and feed the ES cluster? Should I increase or decrease the number of partitions? What can be done to increase the rate at which logs are read from the cluster and ingested into Elasticsearch?

Like I said, I'm very new to Kafka. Thanks for the help.

Tim
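For context on the consumer side: Kafka assigns each partition to at most one consumer thread within a consumer group, so with 8 partitions the useful parallelism in the "logstash" group is capped at 8 threads total, however many Logstash processes are running. A minimal sketch of the kind of kafka input involved, assuming the logstash-input-kafka plugin from the Logstash 2.x era (which matches the ZooKeeper-based offset checker above; the addresses and values here are illustrative, not my exact config):

```
input {
  kafka {
    # Assumed ZooKeeper connection string -- replace with the real ensemble.
    zk_connect       => "localhost:2181"
    group_id         => "logstash"
    topic_id         => "bro-logs"
    # With 4 Logstash instances, 2 threads each = 8 threads,
    # one per partition; more threads than partitions sit idle.
    consumer_threads => 2
  }
}
```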