Another alternative is to check out KaBoom: https://github.com/blackberry/KaBoom
It uses a pared-down Kafka consumer library to pull data from Kafka and write it to defined (and somewhat dynamic) HDFS paths in a custom (and changeable) Avro schema we call boom. It uses Kerberos for authentication and supports very high throughput. It's still actively being developed, with a new release coming soon with enhanced configuration through a new REST API (kontroller).

Cheers,
Todd

Original Message
From: Guozhang Wang
Sent: Thursday, October 22, 2015 5:03 PM
To: users@kafka.apache.org
Reply To: users@kafka.apache.org
Subject: Re: future of Camus?

Hi Adrian,

Another alternative approach is to use Kafka's own Copycat framework for data ingress / egress. It will be released in our 0.9.0 version, expected in Nov. Under Copycat, users can write different "connectors" instantiated for different source / sink systems, while for your case there is an in-built HDFS connector coming along with the framework itself.

You can find more details in these Kafka wikis / Java docs:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=58851767

https://s3-us-west-2.amazonaws.com/confluent-files/copycat-docs-wip/intro.html

Guozhang

On Thu, Oct 22, 2015 at 12:52 PM, Henry Cai <h...@pinterest.com.invalid> wrote:

> Take a look at secor:
>
> https://github.com/pinterest/secor
>
> Secor is a no-frills Kafka->HDFS ingestion tool. It doesn't depend on any
> underlying systems such as Hadoop; it only uses the Kafka high-level
> consumer to balance the workload. Very easy to understand and manage. It's
> probably the 2nd most popular Kafka/HDFS ingestion tool (behind Camus).
> Lots of web companies use this to do the Kafka data ingestion
> (Pinterest/Uber/AirBnb).
>
>
> On Thu, Oct 22, 2015 at 3:56 AM, Adrian Woodhead <awoodh...@hotels.com>
> wrote:
>
> > Hello all,
> >
> > We're looking at options for getting data from Kafka onto HDFS, and Camus
> > looks like the natural choice for this. It's also evident that LinkedIn,
> > who originally created Camus, are taking things in a different direction
> > and are advising people to use their Gobblin ETL framework instead. We
> > feel that Gobblin is overkill for many simple use cases and Camus seems a
> > much simpler and better fit. The problem now is that, with LinkedIn
> > apparently withdrawing official support for it, it appears that any
> > changes to Camus are being managed by various forks of it, and it looks
> > like everyone is building and using their own versions. Wouldn't it be
> > better for a community to form around one official fork so development
> > efforts can be focused on this? Any thoughts on this?
> >
> > Thanks,
> >
> > Adrian
> >
>

--
-- Guozhang