Re: Persist Queue On HDFS

2014-01-14 Thread Rob Withers
Nice idea, but a different sort of animal. Going to HDFS is different: it requires aggregation of traffic, so the whole offset commit strategy becomes a concern. When pulling traffic for per-message work, we commit after every pull, so exactly once. The tradeoff with aggregation is whether to a
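
To make the offset commit concern concrete, here is a minimal sketch in plain Python of committing only after an aggregated batch has been flushed. It uses no Kafka or HDFS client; `BatchingSink`, `flush`, `commit`, and `batch_size` are hypothetical names chosen for illustration, and the two lists simply stand in for written files and a committed-offset store.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BatchingSink:
    """Buffers messages and commits an offset only after a batch is flushed."""
    batch_size: int
    flush: Callable[[List[str]], None]  # hypothetical: writes one aggregated file
    commit: Callable[[int], None]       # hypothetical: records the consumer offset
    _buffer: List[str] = field(default_factory=list)
    _last_offset: int = -1

    def handle(self, offset: int, message: str) -> None:
        # Buffer the message; flush once the batch is full.
        self._buffer.append(message)
        self._last_offset = offset
        if len(self._buffer) >= self.batch_size:
            self.flush_now()

    def flush_now(self) -> None:
        # Write the batch first, then commit the offset of its last message.
        if not self._buffer:
            return
        self.flush(list(self._buffer))
        self.commit(self._last_offset)
        self._buffer.clear()


# Stand-ins for the file store and the offset store (assumptions, not real APIs).
files: List[List[str]] = []
commits: List[int] = []

sink = BatchingSink(batch_size=3, flush=files.append, commit=commits.append)
for offset, msg in enumerate(["a", "b", "c", "d", "e"]):
    sink.handle(offset, msg)
```

In this sketch the last two messages stay buffered and uncommitted until `flush_now()` is called, so a crash at that point would replay them on restart. A crash between the write and the commit would write the same batch twice, which is the duplicate risk that aggregation trades against committing after every pull.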

Re: Persist Queue On HDFS

2014-01-14 Thread Jun Rao
The API in HDFS is quite different from what's in a regular POSIX file system.

Thanks,
Jun

On Tue, Jan 14, 2014 at 1:16 PM, Blender Bl wrote:
> Hi,
>
> My team is trying to implement a lambda architecture.
> We need to stream all our new data through Kafka to Storm and HDFS.
>
> As I see it

Re: Persist Queue On HDFS

2014-01-14 Thread Joe Stein
There is also a Hadoop contrib producer and consumer for HDFS: https://github.com/apache/kafka/tree/0.8/contrib

Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop

Persist Queue On HDFS

2014-01-14 Thread Blender Bl
Hi,

My team is trying to implement a lambda architecture. We need to stream all our new data through Kafka to Storm and HDFS.

As I see it, there are two options:
Using Camus - not very efficient
Streaming via Storm - not very efficient

Is it possible to persist the queue's files over the HDFS (with s