Kafka lost data

2014-10-27 Thread Chen Wang
Hello folks, I recently noticed that our message volume in Kafka seems to have dropped significantly. I didn't see any exceptions on my consumer side. Since the producer is not within my control, I am trying to get some guidance on how I could debug this issue. Our individual message sizes have recently inc

Consumer keeps losing connection

2014-11-01 Thread Chen Wang
Hello Folks, I am using Highlevel consumer, and it seems to drop connections intermittently: 2014-11-01 13:34:40 SimpleConsumer [INFO] Reconnect due to socket error: Received -1 when reading from channel, socket has likely been closed. 2014-11-01 13:34:40 ConsumerFetcherThread [WARN] [ConsumerFetc

Re: Consumer keeps losing connection

2014-11-02 Thread Chen Wang
(SocketServer.scala:405) at kafka.network.Processor.run(SocketServer.scala:265) at java.lang.Thread.run(Thread.java:744) On Sat, Nov 1, 2014 at 9:46 PM, Chen Wang wrote: > Hello Folks, > I am using Highlevel consumer, and it seems to drop connections > intermittently: > > 2014

Consumer lag keeps increasing

2014-11-05 Thread Chen Wang
Hey guys, I have a really simple Storm topology with a Kafka spout, reading from Kafka through the high-level consumer. Since the topic has 30 partitions, we have 30 threads in the spout reading from it. However, the lag keeps increasing even though the threads only read the messages and do nothin

Re: Consumer lag keeps increasing

2014-11-05 Thread Chen Wang
mps and > check if your consumer is blocked on some locks? > > Guozhang > > On Wed, Nov 5, 2014 at 2:01 PM, Chen Wang > wrote: > > > Hey Guys, > > I have a really simply storm topology with a kafka spout, reading from > > kafka through high level consumer. Sin
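
Consumer lag per partition is simply the log-end offset minus the group's committed offset. A minimal pure-Java sketch of how a lag check like the one in this thread adds up (the partition numbers and offsets below are made up for illustration, not taken from the thread):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LagCheck {
    // Lag per partition = log-end offset minus the consumer's committed offset.
    static long lag(long logEndOffset, long committedOffset) {
        return Math.max(0, logEndOffset - committedOffset);
    }

    public static void main(String[] args) {
        // Hypothetical offsets for 3 of the 30 partitions: {logEnd, committed}.
        Map<Integer, long[]> offsets = new LinkedHashMap<>();
        offsets.put(0, new long[]{1_000_000L, 999_990L});
        offsets.put(1, new long[]{1_000_000L, 400_000L});
        offsets.put(2, new long[]{1_000_000L, 1_000_000L});

        long total = 0;
        for (Map.Entry<Integer, long[]> e : offsets.entrySet()) {
            long l = lag(e.getValue()[0], e.getValue()[1]);
            total += l;
            System.out.println("partition " + e.getKey() + " lag=" + l);
        }
        System.out.println("total=" + total);
    }
}
```

If this total keeps growing while the consumers are running, either the consumers are blocked (as Guozhang suggests checking via thread dumps) or throughput is simply below the producer's rate.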

Changing retention for a topic on the fly does not work

2014-11-10 Thread Chen Wang
Hey guys, I am using kafka_2.9.2-0.8.1.1 bin/kafka-topics.sh --zookeeper localhost:2182 --alter --topic my_topic --config log.retention.hours.per.topic=48 It says: Error while executing topic command requirement failed: Unknown configuration "log.retention.hours.per.topic". java.lang.IllegalArg

Re: Changing retention for a topic on the fly does not work

2014-11-11 Thread Chen Wang
For those who might need to do the same thing, the command is bin/kafka-topics.sh --zookeeper localhost:2182 --alter --topic yourconfig --config retention.ms=17280 On Mon, Nov 10, 2014 at 4:46 PM, Chen Wang wrote: > Hey guys, > i am using kafka_2.9.2-0.8.1.1 > > bin/kaf
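
One caveat with the command quoted above: the topic-level `retention.ms` config is in milliseconds, so `17280` would be under 18 seconds of retention; the value looks like it may be truncated or a typo. A 48-hour retention (matching the original `log.retention.hours.per.topic=48` attempt) works out as:

```java
import java.util.concurrent.TimeUnit;

public class RetentionMs {
    public static void main(String[] args) {
        // retention.ms takes milliseconds; 48 hours is:
        long fortyEightHours = TimeUnit.HOURS.toMillis(48);
        System.out.println("retention.ms=" + fortyEightHours);
    }
}
```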

Broker keeps rebalancing

2014-11-12 Thread Chen Wang
Hi there, My kafka client is reading a 3 partition topic from kafka with 3 threads distributed on different machines. I am seeing frequent owner changes on the topics when running: bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group my_test_group --topic mytopic -zkconnect localhost:21

Re: Broker keeps rebalancing

2014-11-13 Thread Chen Wang
> > Guozhang > > On Wed, Nov 12, 2014 at 5:31 PM, Neha Narkhede > wrote: > > > Does this help? > > > > > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog > > ? > > > > On Wed, Nov 12, 2014 at

Re: Broker keeps rebalancing

2014-11-13 Thread Chen Wang
> > > > server.1=.com:2888:3888 > > > > server.2=.com:2888:3888 > > > > server.3=.com:2888:3888 > > > > > > On Thu, Nov 13, 2014 at 10:27 AM, Guozhang Wang > > wrote: > > > > > Chen, > > > > > &

Re: Broker keeps rebalancing

2014-11-14 Thread Chen Wang
/10.93.83.50:44094 which had sessionid 0x149a4cc1b581b5b Chen On Thu, Nov 13, 2014 at 5:25 PM, Jun Rao wrote: > Which version of ZK are you using? > > Thanks, > > Jun > > On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang > wrote: > > > Thanks for the info. > > It

Reuse ConsumerConnector

2014-12-10 Thread Chen Wang
Hey guys, I have a use case where my thread reads from different Kafka topics periodically through a timer. The way I am reading from Kafka in the timer callback is the following: try { Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumerConnector .createMessageStreams(topicCountMap); List<KafkaStream<byte[], byte[]>> streamList = consumerMap
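
For context on the snippet above: the argument to `createMessageStreams` in the 0.8 high-level consumer is just a map from topic name to the number of streams (threads) to create for it. A minimal sketch of building that map (the topic name and count here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class TopicCount {
    public static void main(String[] args) {
        // One entry per topic: how many KafkaStreams to create for it.
        Map<String, Integer> topicCountMap = new HashMap<>();
        topicCountMap.put("my_topic", 3); // hypothetical topic, 3 streams
        System.out.println(topicCountMap.get("my_topic"));
    }
}
```

The return value is keyed by topic, with one `KafkaStream` per requested stream; calling `createMessageStreams` repeatedly on the same connector (as the timer callback here does) is the root of the reuse problem being asked about.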

Kafka SimpleConsumer not working

2014-02-20 Thread Chen Wang
Hi, I am using Kafka for the first time, and was running the sample from https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example However, I cannot read any data from Kafka. The topic has 10 partitions, and I tried to read from each of them. The fetch can succeed, however, the

Re: Kafka SimpleConsumer not working

2014-02-20 Thread Chen Wang
i am using 0.8.0. The high level api works as expected. org.apache.kafka kafka_2.10 0.8.0 On Thu, Feb 20, 2014 at 2:40 PM, Chen Wang wrote: > Hi, > I am using kafka for the first time, and was running the sample from > > https://cwiki.apache.org/confluence/display

Re: Kafka SimpleConsumer not working

2014-02-20 Thread Chen Wang
Never mind, it was actually working. I just needed to wait a bit longer for data to come into the partition I was testing. Chen On Thu, Feb 20, 2014 at 2:41 PM, Chen Wang wrote: > i am using 0.8.0. The high level api works as expected. > > > > org.apache.kafka > >

Clean up Consumer Group

2014-02-24 Thread Chen Wang
Hi, It seems that my consumers cannot be shut down properly. I can still see many unused consumers on the portal. Is there a way to get rid of all these consumers? I tried to call shutdown explicitly, but without any luck. Any help appreciated. Chen

error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-07 Thread Chen Wang
Folks, I have a process that starts at a specific time and reads from a specific topic. I am currently using the high-level API (consumer group) to read from Kafka (and will stop once there is nothing in the topic, by specifying a timeout). I am most concerned about error recovery in a multi-threaded context

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-07 Thread Chen Wang
commit offset.. On Thu, Aug 7, 2014 at 4:54 PM, Guozhang Wang wrote: > Hello Chen, > > With high-level consumer, the partition re-assignment is automatic upon > consumer failures. > > Guozhang > > > On Thu, Aug 7, 2014 at 4:41 PM, Chen Wang > wrote: > > >

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-07 Thread Chen Wang
artitions that the consumer is currently fetching, so there > is no need to coordinate this operation. > > > On Thu, Aug 7, 2014 at 5:03 PM, Chen Wang > wrote: > > > But with the auto commit turned on, I am risking off losing the failed > > message, right? should I tu

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-07 Thread Chen Wang
Just did some testing. It seems that the rebalance will occur upon zookeeper.session.timeout.ms. So yes, if one thread dies, the leftover messages will be picked up by other threads. On Thu, Aug 7, 2014 at 5:36 PM, Chen Wang wrote: > Guozhan

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
too often does have an overhead since it is going to > Zookeeper, and ZK is not write-scalable. We are also fixing that issue by > moving the offset management from ZK to kafka servers. This is already > checked in trunk, and will be included in 0.8.2 release. > > Guozhang > >

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
.. On Fri, Aug 8, 2014 at 10:40 AM, Chen Wang wrote: > Thanks,Guozhang, > So if I switch to SimpleConsumer, will these problems be taken care of > already? I would assume that I will need to manage all the offset by > myself, including the error recovery logic, right? > Chen > >

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
e and hence commit > offsets) and you could live with data duplicates, then you can just enable > auto offset commits with say, 10 secs period. We usually have even larger > period, like minutes. > > Guozhang > > > On Fri, Aug 8, 2014 at 11:11 AM, Chen Wang > wrote: >
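
The auto-commit trade-off discussed above comes down to two properties of the 0.8 high-level consumer. A hedged sketch of the relevant config, using the 10-second interval mentioned in the thread (a longer interval means less ZooKeeper write load, but more potential duplicates on failure):

```java
import java.util.Properties;

public class AutoCommitConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Enable periodic automatic offset commits to ZooKeeper.
        props.put("auto.commit.enable", "true");
        // Commit every 10 seconds; larger intervals reduce ZK load
        // at the cost of more re-delivered messages after a crash.
        props.put("auto.commit.interval.ms", "10000");
        System.out.println(props.getProperty("auto.commit.interval.ms"));
    }
}
```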

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
i am missing here? Chen On Fri, Aug 8, 2014 at 1:09 PM, Guozhang Wang wrote: > Chen, > > You can use the ConsumerOffsetChecker tool. > > http://kafka.apache.org/documentation.html#basic_ops_consumer_lag > > Guozhang > > > On Fri, Aug 8, 2014 at 12:18 PM, Che

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
small. > Could > you try with larger numbers, like 1? > > Guozhang > > > On Fri, Aug 8, 2014 at 1:41 PM, Chen Wang > wrote: > > > Guozhang, > > I just did a simple test, and kafka does not seem to do what it is > supposed > > to do: > > I put

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
Guozhang, Just curious, do you guys already have a java version of the ConsumerOffsetChecker https://github.com/apache/kafka/blob/0.8/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala so that I could use it in my storm topology? Chen On Fri, Aug 8, 2014 at 2:03 PM, Chen Wang wrote

Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Folks, is there any potential issue with creating 240 topics every day? Although the retention of each topic is set to 2 days, I am a little concerned that, since right now there is no delete-topic API, the ZooKeeper ensemble might be overloaded. Thanks, Chen

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
r it is too large for ZooKeeper's default 1 MB data size. > > You also need to think about the number of open file handles. Even with no > data, there will be open files for each topic. > > -Todd > > > On 8/11/14, 2:19 PM, "Chen Wang" wrote: > > >Folks, &g

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
n time range? I'm not sure it makes sense to use > Kafka in this manner. > > Can you provide more detail? > > > Philip > > > - > http://www.philipotoole.com > > > On Monday, August 11, 2014 4:45 PM, Chen Wang >

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
can run enough consumers such that you can keep up. The fact that > you are thinking about so many topics is a sign your design is wrong, or > Kafka is the wrong solution. > > Philip > > > On Aug 11, 2014, at 5:18 PM, Chen Wang > wrote: > > > > Philip, > > Tha

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
uickly run out of file descriptors, amongst other issues. > > Philip > > > > > ----- > http://www.philipotoole.com > > > On Aug 11, 2014, at 6:42 PM, Chen Wang > wrote: > > > > "And if you can't consume it all within 6 minutes, partiti

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
on at all and same topic with multiple partitions would > get you what you need. > > > On Tue, Aug 12, 2014 at 8:04 AM, Chen Wang > wrote: > > > Those data has a timestamp: its actually email campaigns with scheduled > > send time. But since they can be scheduled ahe

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
a combination of the above approach with day-long > topics might be a good compromise. Depends how badly you want to use Kafka. > > Philip > > -- > http://www.philipotoole.com > > > > > > On Aug 11, 2014, at 7:34 PM, Chen Wang >

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
n proceed to topic2 at > 12:06, and so on. The next week, you loop around over exactly the same > topics, knowing that the retention settings have cleared out the old data. > > -Todd > > On 8/11/14, 4:45 PM, "Chen Wang" wrote: > > >Todd, > >I actually only
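
The 240-topics-per-day figure corresponds to one topic per 6-minute slot (24 × 60 / 6 = 240), and Todd's loop-around scheme amounts to mapping the minute of day to a fixed slot index and reusing the same topic names every cycle. A sketch of that mapping (the `topic_` naming is illustrative, not from the thread):

```java
public class SlotIndex {
    // 6-minute slots: minute-of-day 0..1439 maps to slot 0..239.
    static int slot(int minuteOfDay) {
        return minuteOfDay / 6;
    }

    public static void main(String[] args) {
        System.out.println(24 * 60 / 6);          // slots per day
        System.out.println(slot(0));              // first slot of the day
        System.out.println(slot(1439));           // last slot of the day
        System.out.println("topic_" + slot(726)); // hypothetical topic name for 12:06pm
    }
}
```

Since the slot count is fixed, retention settings clear out old data before a slot's topic is reused, and no topic deletion is ever needed.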

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
t we push through it. If you can > have at least 4 or 5 brokers, I wouldn’t anticipate any problems with the > number of partitions. You may need more than that depending on the > throughput you want to handle. > > -Todd > > On 8/11/14, 9:20 PM, "Chen Wang" wrote: > &g

Correct way to handle ConsumerTimeoutException

2014-08-12 Thread Chen Wang
Folks, I am using consumer.timeout.ms to force a consumer to jump out of the hasNext call, which will throw ConsumerTimeoutException. It seems that upon receiving this exception the consumer is no longer usable, and I need to call shutdown and recreate it: try { } catch (ConsumerTimeoutException ex) { logge

Re: Correct way to handle ConsumerTimeoutException

2014-08-14 Thread Chen Wang
> > > > If you want to restart the consumer in handling the timeout exception, > then > > you should probably just increasing the timeout value in the configs to > > avoid it throwing timeout exception. > > > > Guozhang > > > > > > On T
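
The consumer.timeout.ms pattern in this thread (block on hasNext, bail out once the topic has been quiet for the timeout) resembles polling a blocking queue with a timeout. A pure-JDK analogy of the control flow, not the Kafka API itself:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class TimeoutPoll {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        queue.put("msg-1");

        // Drain until no message arrives within the timeout,
        // analogous to hasNext() throwing ConsumerTimeoutException.
        while (true) {
            String msg = queue.poll(100, TimeUnit.MILLISECONDS);
            if (msg == null) {          // "timeout": topic is quiet
                System.out.println("timed out, shutting down");
                break;
            }
            System.out.println("consumed " + msg);
        }
    }
}
```

As Guozhang's reply suggests, if the timeout is only being used to detect "caught up, stop reading", raising the timeout (or treating the exception as the normal end-of-batch signal and then shutting down) avoids churning through connector recreation.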

How many threads should I use per topic

2014-08-14 Thread Chen Wang
Hey guys, I am using the high-level consumer. I have a daemon process that checks the lag for a topic. Suppose I have a topic with 5 partitions, and partitions 0 and 1 have a lag of 0, while the other 3 all have lag. In this case, is it best to start 3 threads or 5 threads to read from this topic again
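
With the high-level consumer, streams are assigned whole partitions, so matching the thread count to the partition count guarantees every partition (lagging or not) has a reader, and a thread assigned a lag-0 partition simply idles. A sketch of that reasoning, with hypothetical per-partition lags standing in for the thread's example:

```java
public class ThreadCount {
    public static void main(String[] args) {
        long[] lags = {0, 0, 5000, 1200, 300}; // hypothetical per-partition lags
        int lagging = 0;
        for (long lag : lags) {
            if (lag > 0) lagging++;
        }
        // One thread per partition is the safe choice: partition-to-stream
        // assignment is the consumer's, not ours, so starting only as many
        // threads as there are lagging partitions gives no control over
        // which partitions those threads actually receive.
        System.out.println("lagging=" + lagging + " threads=" + lags.length);
    }
}
```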