Does not look like it has been updated for 0.8, but you may want to
check with the author directly.
On Tue, Jan 07, 2014 at 08:38:04PM -0500, Ray Rodriguez wrote:
> Will the current kafka-s3-consumer (
> https://github.com/razvan/kafka-s3-consumer) work with 0.8.0?
>
> Ray Rodriguez
Will the current kafka-s3-consumer (
https://github.com/razvan/kafka-s3-consumer) work with 0.8.0?
Ray Rodriguez
Medidata Solutions
Noticed this s3 based consumer project on github
https://github.com/razvan/kafka-s3-consumer
On Dec 27, 2012, at 7:08 AM, David Arthur wrote:
> I don't think anything exists like this in Kafka (or contrib), but it would
> be a useful addition! Personally, I have written this exact thing at
> previous jobs.
Would you please contribute this to open source? What you've written
has been asked for many times. FWIW, I would immediately incorporate
it into my book, Agile Data.
Russell Jurney http://datasyndrome.com
On Dec 28, 2012, at 8:06 AM, Liam Stewart wrote:
> We have a tool that reads data continuously from brokers and then writes
> files to S3.
We have a tool that reads data continuously from brokers and then writes
files to S3. A MR job didn't make sense for us given our current size and
volume. We have one instance running right now and could add more if
needed, adjusting which instance reads from which brokers/topics/...
Unfortunate
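The tool Liam describes isn't public, but the pattern (consume continuously, buffer, and flush one S3 object per batch) is straightforward to sketch. The following is a hypothetical illustration, not the actual tool: the `upload` callable, key naming, and thresholds are all assumptions, and in practice `upload` would be backed by an S3 client such as boto.

```python
import io
import time


class S3BatchWriter:
    """Buffers consumed messages and flushes them as one object per batch.

    Hypothetical sketch of the broker-to-S3 pattern described above;
    `upload` is a stand-in for a real S3 put (e.g. via boto).
    """

    def __init__(self, upload, max_bytes=64 * 1024 * 1024, max_age_s=300):
        self.upload = upload        # callable(key, data) -> None
        self.max_bytes = max_bytes  # flush once the buffer reaches this size
        self.max_age_s = max_age_s  # ...or once the oldest message is this old
        self.buf = io.BytesIO()
        self.first_write = None
        self.batch_num = 0

    def append(self, message: bytes):
        if self.first_write is None:
            self.first_write = time.time()
        self.buf.write(message + b"\n")
        if (self.buf.tell() >= self.max_bytes
                or time.time() - self.first_write >= self.max_age_s):
            self.flush()

    def flush(self):
        if self.buf.tell() == 0:
            return
        # One S3 object per batch; the key scheme here is illustrative.
        key = "topic/batch-%06d" % self.batch_num
        self.upload(key, self.buf.getvalue())
        self.batch_num += 1
        self.buf = io.BytesIO()
        self.first_write = None
```

Running one such writer per broker/topic assignment, as described above, sidesteps the need for an MR job at small volumes.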
Hi Matthew,
I may be doing something wrong.
I cloned the code at
https://github.com/apache/kafka/tree/trunk/contrib/hadoop-consumer
I am running the following:
- ./run-class.sh kafka.etl.impl.DataGenerator test/test.properties which
generates a /tmp/kafka/data/1.dat file containing
Dump tcp://local
So the hadoop consumer does use the latest offset; it reads it from the
'input' directory in the record reader.
We have a heavily modified version of the hadoop consumer that reads and
writes offsets to ZooKeeper (much like the Scala consumers), and this works
great.
FWIW we also use the hadoop cons
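The ZooKeeper-offset idea above amounts to: resume from the last committed offset, process a batch, and commit the new offset only after the output is safely written. A minimal sketch of that logic, assuming a plain dict stands in for ZooKeeper (with kazoo you would read/write a znode such as /consumers/&lt;group&gt;/offsets/&lt;topic&gt;/&lt;partition&gt;) and `fetch` stands in for a broker fetch:

```python
def run_job(offset_store, fetch, process, topic, partition):
    """Resume from the last committed offset for (topic, partition),
    process one batch, then commit the new offset.

    offset_store: dict standing in for ZooKeeper in this sketch.
    fetch: callable(topic, partition, start) -> (messages, next_offset),
           standing in for a broker fetch request.
    """
    key = (topic, partition)
    start = offset_store.get(key, 0)       # no committed offset: start at 0
    messages, next_offset = fetch(topic, partition, start)
    for m in messages:
        process(m)
    offset_store[key] = next_offset        # commit only after the batch succeeds
    return next_offset
```

Committing after processing gives at-least-once delivery: if the job dies mid-batch, the next run re-reads from the last committed offset rather than skipping data.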
I went through the source code of the Hadoop consumer in contrib. It doesn't
seem to be using the previous offset at all, neither in the DataGenerator
nor in the map-reduce stage.
Before I go into the implementation, I can think of two approaches:
1. A ConsumerConnector receiving all the messages continuously, and then
I don't think anything exists like this in Kafka (or contrib), but it
would be a useful addition! Personally, I have written this exact thing
at previous jobs.
As for the Hadoop consumer, since there is a FileSystem implementation
for S3 in Hadoop, it should be possible. The Hadoop consumer wo