Because we need to do exploratory data analysis and machine learning, we need to back up the messages somewhere so that the data scientists can query or load them.
So we need something like a router: a process that opens a new consumer group and keeps storing the messages to S3.

On Tue, Dec 6, 2016 at 5:05 PM, Sharninder Khera <> wrote:

> Why not just have a parallel consumer read all messages from whichever
> topics you're interested in and store them wherever you want to? You don't
> need to "backup" Kafka messages.
>
> _____________________________
> From: Aseem Bansal <>
> Sent: Tuesday, December 6, 2016 4:55 PM
> Subject: Storing Kafka Message JSON to deep storage like S3
> To: <>
>
>
> Hi
>
> Has anyone done a storage of Kafka JSON messages to deep storage like S3.
> We are looking to back up all of our raw Kafka JSON messages for
> Exploration. S3, HDFS, MongoDB come to mind initially.
>
> I know that it can be stored in kafka itself but storing them in Kafka
> itself does not seem like a good option as we won't be able to query it and
> the configurations of machines containing kafka will have to be increased
> as we go. Something like S3 we won't have to manage.
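
For what it's worth, here is a rough sketch of the batching half of the "separate consumer group that keeps storing to S3" idea. It is only an illustration under assumptions, not a tested pipeline: `archive_batches`, the `upload` callback, the `raw/my-topic` prefix and the `part-NNNNNNNN.jsonl` naming are all made up for this example. The Kafka consumer and the S3 client are deliberately left out, so the code runs without a broker or AWS credentials; in a real setup `messages` would be the message values read by a consumer in its own group, and `upload` would wrap an S3 put call.

```python
import json
from typing import Callable, Iterable, List


def archive_batches(
    messages: Iterable[str],
    upload: Callable[[str, bytes], None],
    batch_size: int = 1000,
    prefix: str = "raw/my-topic",  # hypothetical key prefix
) -> int:
    """Group JSON messages into newline-delimited objects and upload them.

    Returns the number of objects handed to `upload`.
    """
    batch: List[str] = []
    uploaded = 0

    def flush() -> None:
        nonlocal uploaded
        if not batch:
            return
        # Hypothetical naming scheme: one numbered object per batch.
        key = f"{prefix}/part-{uploaded:08d}.jsonl"
        body = ("\n".join(batch) + "\n").encode("utf-8")
        upload(key, body)
        uploaded += 1
        batch.clear()

    for raw in messages:
        # Parse and re-serialize compactly so each message occupies exactly
        # one line; malformed JSON raises ValueError here.
        batch.append(json.dumps(json.loads(raw), separators=(",", ":")))
        if len(batch) >= batch_size:
            flush()

    flush()  # upload whatever is left in the final partial batch
    return uploaded


# Stand-in for S3: a dict mapping object key to bytes.
store = {}
count = archive_batches(
    ['{"a": 1}', '{"a": 2}', '{"a": 3}'], store.__setitem__, batch_size=2
)
```

Newline-delimited JSON is used here because most query and loading tools can read it directly. The sketch does not deal with committing offsets; presumably that should happen only after an upload succeeds, so that a crash re-reads messages instead of losing them.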