+1 for detailed examination of metrics. You can see the main metrics here:
https://kafka.apache.org/documentation.html#monitoring

JConsole is very helpful for taking a quick look at what is going on.

Cheers,
Robert

On Sun, Jun 15, 2014 at 7:49 AM, pushkar priyadarshi <priyadarshi.push...@gmail.com> wrote:

> And one more thing: using Kafka metrics you can easily monitor the rate at
> which you are able to publish to Kafka and the speed at which your consumer
> (in this case your spout) is able to drain messages out of Kafka. It's
> possible that, in the worst case, slow draining even affects the publishing
> rate: if the consumer lags too far behind, consuming the older messages
> will result in disk seeks.
>
> On Sun, Jun 15, 2014 at 8:16 PM, pushkar priyadarshi <priyadarshi.push...@gmail.com> wrote:
>
> > What throughput are you getting from your Kafka cluster alone? Storm
> > throughput can depend on what processing you are actually doing inside
> > it, so you must look at each component, starting with Kafka.
> >
> > Regards,
> > Pushkar
> >
> > On Sat, Jun 14, 2014 at 8:44 PM, Shaikh Ahmed <rnsr.sha...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Daily we download 28 million messages, and monthly it goes up to
> >> 800+ million.
> >>
> >> We want to process this amount of data through our Kafka and Storm
> >> cluster and would like to store it in an HBase cluster.
> >>
> >> We are targeting to process one month of data in one day. Is it possible?
> >>
> >> We set up our cluster thinking that we could process millions of messages
> >> per second, as mentioned on the web. Unfortunately, we have ended up
> >> processing only 1200-1700 messages per second. If we continue at this
> >> speed, it will take a minimum of 10 days to process 30 days of data,
> >> which is not an acceptable solution in our case.
> >>
> >> I suspect that we have to change some configuration to achieve this goal.
> >> Looking for help from experts to support me in achieving this task.
> >>
> >> *Kafka Cluster:*
> >> Kafka is running on two dedicated machines with 48 GB of RAM and 2TB of
> >> storage. We have an 11-node Kafka cluster in total, spread across these
> >> two servers.
> >>
> >> *Kafka Configuration:*
> >> producer.type=async
> >> compression.codec=none
> >> request.required.acks=-1
> >> serializer.class=kafka.serializer.StringEncoder
> >> queue.buffering.max.ms=100000
> >> batch.num.messages=10000
> >> queue.buffering.max.messages=100000
> >> default.replication.factor=3
> >> controlled.shutdown.enable=true
> >> auto.leader.rebalance.enable=true
> >> num.network.threads=2
> >> num.io.threads=8
> >> num.partitions=4
> >> log.retention.hours=12
> >> log.segment.bytes=536870912
> >> log.retention.check.interval.ms=60000
> >> log.cleaner.enable=false
> >>
> >> *Storm Cluster:*
> >> Storm is running with 5 supervisors and 1 nimbus on IBM servers with
> >> 48 GB of RAM and 8TB of storage. These servers are shared with the HBase
> >> cluster.
> >>
> >> *Kafka spout configuration*
> >> kafkaConfig.bufferSizeBytes = 1024*1024*8;
> >> kafkaConfig.fetchSizeBytes = 1024*1024*4;
> >> kafkaConfig.forceFromStart = true;
> >>
> >> *Topology: StormTopology*
> >> Spout - Partition: 4
> >> First Bolt - parallelism hint: 6 and Num tasks: 5
> >> Second Bolt - parallelism hint: 5
> >> Third Bolt - parallelism hint: 3
> >> Fourth Bolt - parallelism hint: 3 and Num tasks: 4
> >> Fifth Bolt - parallelism hint: 3
> >> Sixth Bolt - parallelism hint: 3
> >>
> >> *Supervisor configuration:*
> >>
> >> storm.local.dir: "/app/storm"
> >> storm.zookeeper.port: 2181
> >> storm.cluster.mode: "distributed"
> >> storm.local.mode.zmq: false
> >> supervisor.slots.ports:
> >>     - 6700
> >>     - 6701
> >>     - 6702
> >>     - 6703
> >> supervisor.worker.start.timeout.secs: 180
> >> supervisor.worker.timeout.secs: 30
> >> supervisor.monitor.frequency.secs: 3
> >> supervisor.heartbeat.frequency.secs: 5
> >> supervisor.enable: true
> >>
> >> storm.messaging.netty.server_worker_threads: 2
> >> storm.messaging.netty.client_worker_threads: 2
> >> storm.messaging.netty.buffer_size: 52428800 #50MB buffer
> >> storm.messaging.netty.max_retries: 25
> >> storm.messaging.netty.max_wait_ms: 1000
> >> storm.messaging.netty.min_wait_ms: 100
> >>
> >> supervisor.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
> >> worker.childopts: "-Xmx2048m -Djava.net.preferIPv4Stack=true"
> >>
> >> Please let me know if more information is needed.
> >>
> >> Thanks in advance.
> >>
> >> Regards,
> >> Riyaz
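
For a rough sense of the gap described in the original question, here is a minimal Python sketch of the throughput arithmetic. It uses only the figures quoted in the thread (about 800 million messages per month, 1200-1700 messages per second observed) and assumes a constant sustained rate with no downtime, which a real pipeline will not achieve. The function names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_rate(total_messages, days):
    """Sustained messages/second needed to process total_messages in `days` days."""
    return total_messages / (days * SECONDS_PER_DAY)


def days_needed(total_messages, rate_per_sec):
    """Days needed to process total_messages at a constant rate_per_sec."""
    return total_messages / (rate_per_sec * SECONDS_PER_DAY)


# Figures taken from the question; assumed to be a steady, uninterrupted load.
monthly_messages = 800_000_000

target = required_rate(monthly_messages, days=1)
slow = days_needed(monthly_messages, rate_per_sec=1200)
fast = days_needed(monthly_messages, rate_per_sec=1700)

print(f"rate needed for one month in one day: {target:.0f} msg/s")
print(f"days at 1200 msg/s: {slow:.1f}")
print(f"days at 1700 msg/s: {fast:.1f}")
```

Under these assumptions the target works out to roughly 9,300 messages per second, so the end-to-end pipeline would need to run about 5 to 8 times faster than the observed rate. Measuring Kafka alone first, as suggested above, shows which stage is actually responsible for that gap.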