Re: Kafka/Hadoop consumers and producers

Andrew Psaltis Thu, 08 Aug 2013 20:23:34 -0700

Felix,
The Camus route is the direction I have headed, for a lot of the reasons 
that you described. The only wrinkle is that we are still on Kafka 0.7.3, so 
I am in the process of backporting this patch: 
https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8
which is described here: 
https://groups.google.com/forum/#!topic/camus_etl/VcETxkYhzg8
so that we can handle reading and writing non-avro'ized (if that is a word) 
data.
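
For anyone following along, the two pieces Ken mentioned (a message decoder 
and a data writer) have roughly the shape below. This is only a sketch: the 
interface and class names are made up for illustration and are not the actual 
Camus API, and it assumes the payloads are plain UTF-8 text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical interfaces for illustration only; NOT the real Camus API.
interface PayloadDecoder<R> {
    R decode(byte[] payload);
}

interface RecordSink<R> {
    void write(R record);
}

// Decodes a raw message payload as a UTF-8 string instead of an Avro record.
class Utf8Decoder implements PayloadDecoder<String> {
    @Override
    public String decode(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }
}

// Writes each decoded record as one newline-terminated line.
// A real writer would target a file in HDFS; a StringBuilder stands in here.
class LineSink implements RecordSink<String> {
    private final StringBuilder out;

    LineSink(StringBuilder out) {
        this.out = out;
    }

    @Override
    public void write(String record) {
        out.append(record).append('\n');
    }
}

class NonAvroSketch {
    // Pushes every raw payload through the decoder and into the sink.
    static String run(byte[][] payloads) {
        StringBuilder buffer = new StringBuilder();
        PayloadDecoder<String> decoder = new Utf8Decoder();
        RecordSink<String> sink = new LineSink(buffer);
        for (byte[] payload : payloads) {
            sink.write(decoder.decode(payload));
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        byte[][] payloads = {
            "first event".getBytes(StandardCharsets.UTF_8),
            "second event".getBytes(StandardCharsets.UTF_8)
        };
        System.out.print(run(payloads));
    }
}
```

The point of keeping the decoder and the writer separate is that either one 
can be swapped (JSON instead of plain text, a sequence file instead of lines) 
without touching the other.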


I hope to have that done sometime in the morning and would be happy to 
share it if others can benefit from it.

Thanks,
Andrew


On Thursday, August 8, 2013 7:18:27 PM UTC-6, Felix GV wrote:
>
> The contrib code is simple and probably wouldn't require too much work to 
> fix, but it's a lot less robust than Camus, so you would ideally need to do 
> some work to make it solid against all edge cases, failure scenarios and 
> performance bottlenecks...
>
> I would definitely recommend investing in Camus instead, since it already 
> covers a lot of the challenges I'm mentioning above, and also has more 
> community support behind it at the moment (as far as I can tell, anyway), 
> so it is more likely to keep getting improvements than the contrib code.
>
> --
> Felix
>
>
> On Thu, Aug 8, 2013 at 9:28 AM, <psaltis...@gmail.com> wrote:
>
>> We also have a need today to ETL from Kafka into Hadoop, and we do not 
>> currently use Avro, nor do we have any plans to.
>>
>> So is the official direction, based on this discussion, to ditch the Kafka 
>> contrib code and direct people to use Camus without Avro as Ken described, 
>> or are both solutions going to survive?
>>
>> I can put time into the contrib code and/or work on documenting the 
>> tutorial on how to make Camus work without Avro.
>>
>> Which is the preferred route, for the long term?
>>
>> Thanks,
>> Andrew
>>
>> On Wednesday, August 7, 2013 10:50:53 PM UTC-6, Ken Goodhope wrote:
>> > Hi Andrew,
>> >
>> >
>> >
>> > Camus can be made to work without Avro. You will need to implement a 
>> > message decoder and a data writer. We need to add a better tutorial 
>> > on how to do this, but it isn't that difficult. If you decide to go down 
>> > this path, you can always ask questions on this list. I try to make sure 
>> > each email gets answered, but it can take me a day or two.
>> >
>> >
>> >
>> > -Ken
>> >
>> >
>> >
>> > On Aug 7, 2013, at 9:33 AM, ao...@wikimedia.org wrote:
>> >
>> >
>> >
>> > > Hi all,
>> >
>> > >
>> >
>> > > Over at the Wikimedia Foundation, we're trying to figure out the best 
>> > > way to do our ETL from Kafka into Hadoop.  We don't currently use Avro and 
>> > > I'm not sure if we are going to.  I came across this post.
>> >
>> > >
>> >
>> > > If the plan is to remove the hadoop-consumer from Kafka contrib, do 
>> > > you think we should not consider it as one of our viable options?
>> >
>> > >
>> >
>> > > Thanks!
>> >
>> > > -Andrew
>> >
>>
>>
>
