from:"Ben Vogan"

Re: Replicating Cassandra data to HDFS

2016-08-09 Thread Ben Vogan

; for duplication checks to dedup then output to another source (a form of dual write, but with dedup). This was really silly and slow. I only bring it up to save you the trouble in case you end up on the same path, chasing something more 'real time'. Regards, R

Replicating Cassandra data to HDFS

2016-08-09 Thread Ben Vogan

Hi all, we are investigating using Cassandra in our data platform. We would like data to go into Cassandra first and eventually be replicated into our data lake in HDFS for long-term cold storage. Does anyone know of a good way of doing this? We would rather not have parallel writes to HDFS