Re: Proper use of ConsumerConnector

2012-12-21 Thread Yonghui Zhao
Thanks Neha and Joel. My understanding of offsets is: 1. The offset stored in ZK is only used when the consumer connects again. 2. Joel's suggestion that "in fact setting an autocommit interval and being willing to deal with duplicates is almost equivalent" makes sense. But if a crash happens jus
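Joel's point about autocommit and duplicates can be sketched with a small simulation. This is plain Java, not the Kafka API; the class and method names are made up for illustration. Because the offset is committed only after processing, a crash between commits means the messages since the last commit are replayed on restart: duplicates, but no data loss.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation (not the Kafka API): a consumer processes
// messages and commits its offset only every `commitInterval` messages,
// mirroring autocommit. On restart from the last committed offset,
// messages processed since that commit are seen again (duplicates),
// but none are lost.
public class AutocommitSim {
    // Returns the sequence of offsets processed across a crash at `crashAt`.
    public static List<Integer> processedWithCrash(int total, int commitInterval, int crashAt) {
        List<Integer> processed = new ArrayList<>();
        int committed = 0;
        // First run: process until the crash point, committing periodically.
        for (int offset = 0; offset < crashAt; offset++) {
            processed.add(offset);
            if ((offset + 1) % commitInterval == 0) {
                committed = offset + 1;   // commit AFTER processing
            }
        }
        // Restart: resume from the last committed offset, not from crashAt,
        // so offsets in [committed, crashAt) are processed twice.
        for (int offset = committed; offset < total; offset++) {
            processed.add(offset);
        }
        return processed;
    }

    public static void main(String[] args) {
        // Crash at offset 5 with commits every 3 messages: the last commit
        // covered offsets 0..2, so offsets 3 and 4 are reprocessed.
        System.out.println(processedWithCrash(7, 3, 5));
        // prints [0, 1, 2, 3, 4, 3, 4, 5, 6]
    }
}
```

A shorter autocommit interval narrows the window of duplicates at the cost of more commit traffic; it never turns this into exactly-once by itself.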

Could not initialize class kafka.utils.Log4jController$ ?

2012-12-21 Thread Jason Huang
Hello, I am writing a simple java client that uses SimpleConsumer to fetch messages from a Kafka Server: .. KafkaConnectionParams connection = new KafkaConnectionParams(); SimpleConsumer consumer = new SimpleConsumer(connection.getKafkaServerURL(), connection.

Re: Could not initialize class kafka.utils.Log4jController$ ?

2012-12-21 Thread Jason Huang
After thinking about this a bit more, I feel like this may be related to my JBoss log4j setting. I will try to run it locally (without JBoss) and see if I can get messages from the Kafka Server. However, any thoughts will be greatly appreciated! thanks, Jason On Fri, Dec 21, 2012 at 10:59 AM, J

Re: Proper use of ConsumerConnector

2012-12-21 Thread Neha Narkhede
> But if crash happens just after offset committed, then unprocessed > message in consumer will be skipped after reconnected. > If the consumer crashes, you will get duplicates, not lose any data. Thanks, Neha

Re: Proper use of ConsumerConnector

2012-12-21 Thread Tom Brown
Does the ConsumerConnector keep track of the offsets of data downloaded from the server (and queued for consumption by the end user of the API), or does it keep track of the actual offset that has been consumed by the end user? --Tom On Fri, Dec 21, 2012 at 10:37 AM, Neha Narkhede wrote: >> But

Re: Proper use of ConsumerConnector

2012-12-21 Thread Jun Rao
It tracks the consumed offset, not the fetched offset. Thanks, Jun On Fri, Dec 21, 2012 at 9:46 AM, Tom Brown wrote: > Does the ConsumerConnector keep track of the offsets of data > downloaded from the server (and queued for consumption by the end user > of the API), or does it keep track of t
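The distinction Jun draws can be illustrated with a toy model (plain Java, not Kafka's internals; the class is hypothetical). The fetcher may run far ahead of the application: the "fetched" offset tracks what has been pulled from the broker into an in-memory queue, while the "consumed" offset advances only when the application actually takes a message from the iterator. Commits are based on the consumed offset, so prefetched-but-unread messages are not marked as done.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of consumed vs. fetched offsets.
public class OffsetTracking {
    private final Deque<Integer> queue = new ArrayDeque<>();
    private int fetchedOffset = 0;   // how far the fetcher has read
    private int consumedOffset = 0;  // how far the application has read

    public void fetchBatch(int size) {           // broker -> in-memory queue
        for (int i = 0; i < size; i++) queue.add(fetchedOffset++);
    }

    public int next() {                          // queue -> application
        int msg = queue.remove();
        consumedOffset = msg + 1;
        return msg;
    }

    public int offsetToCommit() { return consumedOffset; }
    public int fetched() { return fetchedOffset; }

    public static void main(String[] args) {
        OffsetTracking t = new OffsetTracking();
        t.fetchBatch(10);                        // fetcher prefetches 10
        t.next(); t.next(); t.next();            // application reads only 3
        System.out.println("commit=" + t.offsetToCommit() + " fetched=" + t.fetched());
        // prints commit=3 fetched=10
    }
}
```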

Re: Http based producer

2012-12-21 Thread David Arthur
Pratyush, I'm not a big node.js user so I can't speak to any of the node.js clients. I mostly use the Java/Scala client. Some clients attempt to support the ZooKeeper consumer coordination, some don't (since it is hard to get right). There is work in progress within Kafka to simplify the consu

Re: Kafka Node.js Integration Questions/Advice

2012-12-21 Thread Radek Gruchalski
We are using https://github.com/radekg/node-kafka, occasionally pushing about 2500 messages of ~3.5K each per second. No issues so far. It is a different story with consumers: they are stable, but under heavy load we experienced CPU problems. I am the maintainer of that fork. The fork comes with ZK integratio

Re: Proper use of ConsumerConnector

2012-12-21 Thread Yonghui Zhao
In our project we use Senseidb to consume Kafka data. Senseidb processes each message immediately but does not flush to disk immediately. So if Senseidb crashes, all results that have not been flushed are lost, and we want to rewind Kafka. The offset we want to rewind to is the flush checkpoint. In this case,
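The pattern described here, applying messages immediately but only treating the offset as durable at flush time, can be sketched as follows. This is a hypothetical plain-Java model, not Senseidb or Kafka code; all names are made up. After a crash, consumption is rewound to the flush checkpoint, replaying everything that was in memory but never flushed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a flush-coupled checkpoint: the durable
// checkpoint offset advances only when buffered results reach disk.
public class FlushCheckpoint {
    private final List<Integer> memory = new ArrayList<>();
    private int checkpointOffset = 0;  // offset safe to restart from
    private int nextOffset = 0;        // offset after the latest message

    public void consume(int offset) {  // apply immediately, not yet durable
        memory.add(offset);
        nextOffset = offset + 1;
    }

    public void flush() {              // results durable; advance checkpoint
        memory.clear();
        checkpointOffset = nextOffset;
    }

    // On restart after a crash, rewind consumption to the last flush.
    public int restartOffset() { return checkpointOffset; }

    public static void main(String[] args) {
        FlushCheckpoint c = new FlushCheckpoint();
        for (int i = 0; i < 5; i++) c.consume(i);
        c.flush();                     // offsets 0..4 are now durable
        c.consume(5); c.consume(6);    // in memory only; lost on crash
        System.out.println(c.restartOffset());
        // prints 5
    }
}
```

This is exactly the manual offset management that the high-level consumer's automatic commits do not accommodate, which motivates the rest of the thread.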

Re: Proper use of ConsumerConnector

2012-12-21 Thread Tom Brown
It seems that a common thread is that while ConsumerConnector works well for the standard case, it just doesn't work for any case where manual offset management (explicit checkpoints, rollbacks, etc) is required. If any Kafka devs are looking for a way to improve it, I think modifying it to be mor

Re: Kafka Node.js Integration Questions/Advice

2012-12-21 Thread Apoorva Gaurav
Which is the best ZK-based implementation of Kafka in node.js? Our use case is that a pool of node.js HTTP servers will be listening to clients, which will send JSON over HTTP. Using node.js we'll do minimal decoration and compression (preferably Snappy) and write to brokers. We might also need json