Noticed this S3-based consumer project on GitHub:
https://github.com/razvan/kafka-s3-consumer
On Dec 27, 2012, at 7:08 AM, David Arthur wrote:
> I don't think anything exists like this in Kafka (or contrib), but it would
> be a useful addition! Personally, I have written this exact thing at p
Hi, Everyone,
Just want to let people know that there will be a Kafka presentation
(focusing on replication) at ApacheCon in Feb 2013.
http://na.apachecon.com/schedule/presentation/115/
We also plan to have a Kafka meetup.
http://wiki.apache.org/apachecon/ApacheMeetupsNA13
Please sign up for Apa
Would you please contribute this to open source? What you've written
has been asked for many times. FWIW, I would immediately incorporate
it into my book, Agile Data.
Russell Jurney http://datasyndrome.com
On Dec 28, 2012, at 8:06 AM, Liam Stewart wrote:
> We have a tool that reads data continu
At LinkedIn, the most common failure of a Kafka broker is when we have to
deploy new Kafka code/config. Otherwise, the broker can be up for a long
time (e.g., months). It would be good to monitor the following metrics at
the broker: log flush time/rate, produce/fetch requests/messages rate, GC
rate/
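A minimal sketch of polling those broker-side stats over JMX, assuming the broker JVM has remote JMX enabled (the port 9999 here is made up) and that its MBeans are registered under a "kafka" domain; both are assumptions to verify against your deployment:

import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BrokerMetricsProbe {
    public static void main(String[] args) throws Exception {
        // Assumes the broker exposes remote JMX on port 9999 (hypothetical).
        JMXServiceURL url =
            new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // The "kafka:*" domain is an assumption; dump everything registered under it.
            for (ObjectName name : mbs.queryNames(new ObjectName("kafka:*"), null)) {
                System.out.println(name);
                for (MBeanAttributeInfo attr : mbs.getMBeanInfo(name).getAttributes()) {
                    try {
                        System.out.println("  " + attr.getName() + " = "
                                + mbs.getAttribute(name, attr.getName()));
                    } catch (Exception ignored) {
                        // some attributes are not readable; skip them
                    }
                }
            }
        }
    }
}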
This is a known bug in Kafka 0.7.x. Basically, for a new topic, we
bootstrap using all existing brokers. However, if a topic already exists on
some brokers, we never bootstrap again, which means new brokers will be
ignored. For now, you have to manually create the topic on the new brokers
(e.g., by
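A sketch of one way to do that manual creation in 0.7 (the property and class names below are from the 0.7-era producer API as I remember it, so double-check them against your version): produce a single throwaway message directly to the new broker via broker.list, which creates the topic's log on that broker.

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.producer.ProducerConfig;

public class CreateTopicOnNewBroker {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Point directly at the *new* broker (id:host:port), bypassing zk.connect,
        // so the send is guaranteed to hit that broker. Broker id/host are hypothetical.
        props.put("broker.list", "3:new-broker-host:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        try {
            // One message is enough; its content doesn't matter, it only forces log creation.
            producer.send(new ProducerData<String, String>("test-topic", "bootstrap"));
        } finally {
            producer.close();
        }
    }
}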
Then, compression won't help. Try increasing the heap size. If that doesn't
help, you may need to use more brokers.
Thanks,
Jun
On Thu, Dec 27, 2012 at 10:26 PM, xingcan wrote:
> Jun,
>
> Our messages are not plain text. Most of them are JPEG files. I'm not sure
> if compression will be useful
Update: Early (1 week) implementation of Node-Kafka has resulted in the
following observations:
1. Consumer is unstable.
2. If use of the Consumer is mandatory, create the Consumer in application scope,
not request scope (see the sketch after this list).
3. Attempt to close Consumer on application shutdown. Results of unplanned
sh
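For what it's worth, the same idea expressed against the 0.7-era Java high-level consumer (a sketch; property names like "zk.connect"/"groupid" and the exact class names should be checked against your client version): build the connector once at application scope and close it from a shutdown hook rather than per request.

import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.javaapi.consumer.ConsumerConnector;

public class AppScopedConsumer {
    // One connector for the whole application, not one per request.
    private static final ConsumerConnector CONNECTOR = create();

    private static ConsumerConnector create() {
        Properties props = new Properties();
        props.put("zk.connect", "zk1:2181,zk2:2181,zk3:2181"); // hypothetical ensemble
        props.put("groupid", "my-app");                        // hypothetical group name
        return Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
    }

    public static ConsumerConnector get() {
        return CONNECTOR;
    }

    static {
        // Attempt an orderly close on planned shutdown; an unplanned kill still
        // leaves the group to be rebalanced after ZooKeeper session expiry.
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                CONNECTOR.shutdown();
            }
        }));
    }
}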
We have a tool that reads data continuously from brokers and then writes
files to S3. A MR job didn't make sense for us given our current size and
volume. We have one instance running right now and could add more if
needed, adjusting which instance reads from which brokers/topics/...
Unfortunate
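For anyone curious, the S3-writing half of such a tool can be quite small. A rough sketch, assuming messages arrive as byte[] from whatever consumer loop you run (the Kafka side is omitted here) and using the AWS SDK for Java; the bucket name, key layout, and batch size are all illustrative:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Iterator;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectMetadata;

public class S3Sink {
    private static final int BATCH_BYTES = 64 * 1024 * 1024; // illustrative 64 MB batches

    public static void drainToS3(Iterator<byte[]> messages, String topic) throws IOException {
        AmazonS3 s3 = new AmazonS3Client(); // credentials from the default provider chain
        ByteArrayOutputStream batch = new ByteArrayOutputStream();
        int part = 0;

        while (messages.hasNext()) {
            batch.write(messages.next());
            batch.write('\n');
            if (batch.size() >= BATCH_BYTES) {
                upload(s3, topic, part++, batch.toByteArray());
                batch.reset();
            }
        }
        if (batch.size() > 0) {
            upload(s3, topic, part, batch.toByteArray());
        }
    }

    private static void upload(AmazonS3 s3, String topic, int part, byte[] data) {
        ObjectMetadata meta = new ObjectMetadata();
        meta.setContentLength(data.length);
        // Hypothetical bucket and key layout: one object per flushed batch.
        s3.putObject("my-kafka-archive", topic + "/part-" + part,
                     new ByteArrayInputStream(data), meta);
    }
}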
Hi Matthew,
I may be doing something wrong.
I cloned the code at
https://github.com/apache/kafka/tree/trunk/contrib/hadoop-consumer
I am running the following:
- ./run-class.sh kafka.etl.impl.DataGenerator test/test.properties which
generates a /tmp/kafka/data/1.dat file containing
Dump tcp://local
So the Hadoop consumer does use the latest offset; it reads it from the
'input' directory in the record reader.
We have a heavily modified version of the hadoop consumer that reads /
writes offsets to zookeeper [much like the scala consumers] and this works
great.
FWIW we also use the hadoop cons
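A bare-bones illustration of what "offsets in ZooKeeper" can look like, using the plain ZooKeeper client API; the znode path layout here is an assumption for the example, not necessarily what our modified consumer uses:

import java.nio.charset.StandardCharsets;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkOffsetStore {
    private final ZooKeeper zk;

    public ZkOffsetStore(ZooKeeper zk) {
        this.zk = zk;
    }

    // Hypothetical layout: /hadoop-consumer/offsets/<topic>/<partition>
    // (assumes the parent znodes already exist)
    private static String path(String topic, int partition) {
        return "/hadoop-consumer/offsets/" + topic + "/" + partition;
    }

    public void writeOffset(String topic, int partition, long offset)
            throws KeeperException, InterruptedException {
        byte[] data = Long.toString(offset).getBytes(StandardCharsets.UTF_8);
        String p = path(topic, partition);
        if (zk.exists(p, false) == null) {
            zk.create(p, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        } else {
            zk.setData(p, data, -1); // -1 = any version
        }
    }

    public long readOffset(String topic, int partition)
            throws KeeperException, InterruptedException {
        String p = path(topic, partition);
        if (zk.exists(p, false) == null) {
            return -1L; // no offset stored yet; the caller decides where to start
        }
        byte[] data = zk.getData(p, false, null);
        return Long.parseLong(new String(data, StandardCharsets.UTF_8));
    }
}

The -1 sentinel just signals "no offset recorded"; the map task can then fall back to the earliest or latest offset as it sees fit.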
I went through the source code of the Hadoop consumer in contrib. It doesn't
seem to use the previous offset at all, neither in the DataGenerator nor in
the MapReduce stage.
Before I go into the implementation, I can think of 2 ways :
1. A ConsumerConnector receiving all the messages continuously, and then
Hi! I'm playing with kafka 0.7.2 using the following setup:
3 zk nodes ensemble
2 brokers:
* num_partitions:3
* topic.partition.count.map=test-topic:5
My producer connects to brokers using the "zk.connect" property. When the
producer sends messages to the "test-topic" topic, the partitions are
created on both b