I think the answer is that there is currently no strong community-backed solution to consume non-Avro data from Kafka to HDFS.
A lot of people do it, but I think most people have adapted and expanded the contrib code to fit their needs.

--
Felix

On Fri, Aug 9, 2013 at 1:27 PM, Oleg Ruchovets <oruchov...@gmail.com> wrote:

> Yes, I am definitely interested in such capabilities. We are also using
> Kafka 0.7.
> Guys, I already asked, but nobody answered: what is the community using to
> consume from Kafka to HDFS?
> My assumption was that if Camus supports only Avro it will not be suitable
> for everyone, but people transfer from Kafka to Hadoop somehow. So the
> question is: what are the alternatives to Camus for transferring messages
> from Kafka to HDFS?
> Thanks
> Oleg.
>
> On Fri, Aug 9, 2013 at 6:21 AM, Andrew Psaltis <psaltis.and...@gmail.com> wrote:
>
> > Felix,
> > The Camus route is the direction I have headed for a lot of the reasons
> > that you described. The only wrinkle is we are still on Kafka 0.7.3, so I
> > am in the process of backporting this patch:
> > https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8
> > that is described here:
> > https://groups.google.com/forum/#!topic/camus_etl/VcETxkYhzg8 -- so that
> > we can handle reading and writing non-avro'ized (if that is a word) data.
> >
> > I hope to have that done sometime in the morning and would be happy to
> > share it if others can benefit from it.
> >
> > Thanks,
> > Andrew
> >
> > On Thursday, August 8, 2013 7:18:27 PM UTC-6, Felix GV wrote:
> >
> >> The contrib code is simple and probably wouldn't require too much work
> >> to fix, but it's a lot less robust than Camus, so you would ideally need
> >> to do some work to make it solid against all edge cases, failure
> >> scenarios and performance bottlenecks...
> >>
> >> I would definitely recommend investing in Camus instead, since it already
> >> covers a lot of the challenges I'm mentioning above, and also has more
> >> community support behind it at the moment (as far as I can tell, anyway),
> >> so it is more likely to keep getting improvements than the contrib code.
> >>
> >> --
> >> Felix
> >>
> >> On Thu, Aug 8, 2013 at 9:28 AM, <psaltis...@gmail.com> wrote:
> >>
> >>> We also have a need today to ETL from Kafka into Hadoop, and we do not
> >>> currently use Avro, nor do we have any plans to.
> >>>
> >>> So is the official direction based on this discussion to ditch the Kafka
> >>> contrib code and direct people to use Camus without Avro as Ken
> >>> described, or are both solutions going to survive?
> >>>
> >>> I can put time into the contrib code and/or work on documenting the
> >>> tutorial on how to make Camus work without Avro.
> >>>
> >>> Which is the preferred route for the long term?
> >>>
> >>> Thanks,
> >>> Andrew
> >>>
> >>> On Wednesday, August 7, 2013 10:50:53 PM UTC-6, Ken Goodhope wrote:
> >>> > Hi Andrew,
> >>> >
> >>> > Camus can be made to work without Avro. You will need to implement a
> >>> > message decoder and a data writer. We need to add a better tutorial
> >>> > on how to do this, but it isn't that difficult. If you decide to go
> >>> > down this path, you can always ask questions on this list. I try to
> >>> > make sure each email gets answered, but it can take me a day or two.
> >>> >
> >>> > -Ken
> >>> >
> >>> > On Aug 7, 2013, at 9:33 AM, ao...@wikimedia.org wrote:
> >>> >
> >>> > > Hi all,
> >>> > >
> >>> > > Over at the Wikimedia Foundation, we're trying to figure out the
> >>> > > best way to do our ETL from Kafka into Hadoop. We don't currently
> >>> > > use Avro and I'm not sure if we are going to. I came across this
> >>> > > post.
> >>> > >
> >>> > > If the plan is to remove the hadoop-consumer from Kafka contrib, do
> >>> > > you think we should not consider it as one of our viable options?
> >>> > >
> >>> > > Thanks!
> >>> > >
> >>> > > -Andrew
> >>> > >
> >>> > > --
> >>> > > You received this message because you are subscribed to the Google
> >>> > > Groups "Camus - Kafka ETL for Hadoop" group.