Hi,

I once implemented one-way replication from an RDBMS to Cassandra using 
triggers on the source database side. If you timestamp the changes at the 
source, you can write them to Cassandra with the same timestamps, and that 
takes care of most of the ordering of the changes, assuming that your data 
model doesn’t change too much.

In practice:
- Triggers push change events to a commit log, which is then pushed to a queue
- Readers on the Cassandra side read the events from the queue and write them 
to Cassandra with the timestamp from the change event
- Cassandra handles the ordering of change events (for each cell, the write 
with the newest timestamp wins)
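
To illustrate why the arrival order stops mattering, here is a minimal sketch 
in plain Python, not the code we actually ran. ChangeEvent, LastWriteWinsTable 
and replay are hypothetical names, and the table is only a toy stand-in for 
Cassandra's newest-timestamp-wins behaviour:

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change event, as a source-side trigger might emit it."""
    key: str
    value: Optional[str]
    ts_micros: int  # change time at the source, microseconds since epoch


class LastWriteWinsTable:
    """Toy stand-in for a Cassandra table: newest timestamp wins per key."""

    def __init__(self) -> None:
        self._cells: Dict[str, Tuple[int, Optional[str]]] = {}

    def apply(self, event: ChangeEvent) -> None:
        current = self._cells.get(event.key)
        # Simplification: on equal timestamps the existing cell is kept;
        # Cassandra has its own tie-breaking rules.
        if current is None or event.ts_micros > current[0]:
            self._cells[event.key] = (event.ts_micros, event.value)

    def snapshot(self) -> Dict[str, Optional[str]]:
        return {k: v for k, (_, v) in self._cells.items()}


def replay(events) -> Dict[str, Optional[str]]:
    """Apply events to a fresh table and return the final values."""
    table = LastWriteWinsTable()
    for event in events:
        table.apply(event)
    return table.snapshot()


events = [
    ChangeEvent("user:1", "alice", 1_000),
    ChangeEvent("user:1", "alice2", 2_000),
    ChangeEvent("user:2", "bob", 1_500),
    ChangeEvent("user:1", "alice3", 3_000),
]

in_order = replay(events)

shuffled = events[:]
random.Random(42).shuffle(shuffled)
# Shuffled delivery followed by a full resend of every event.
out_of_order = replay(shuffled + events)
```

Both replays end in the same state. On a real cluster the same effect comes 
from CQL's USING TIMESTAMP clause, for example 
INSERT INTO ks.users (id, name) VALUES (?, ?) USING TIMESTAMP ? 
(the keyspace, table and column names here are made up).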

Using timestamps you can resend changes, read in events in any order, etc. If 
you screw up the replication somehow (we did, many times), it is easy to just 
create a dump on the source and load it in again with timestamps, while the 
system keeps running the whole time.
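
A rough sketch of that repair step, again in plain Python with a toy 
last-write-wins table. It assumes every dump row carries the timestamp of its 
last change at the source; the load and values helpers and the sample rows are 
hypothetical:

```python
from typing import Dict, Iterable, Tuple

Row = Tuple[str, str, int]  # (key, value, source timestamp in microseconds)


def load(table: Dict[str, Tuple[int, str]], rows: Iterable[Row]) -> None:
    """Apply rows with newest-timestamp-wins semantics (toy model)."""
    for key, value, ts in rows:
        current = table.get(key)
        if current is None or ts > current[0]:
            table[key] = (ts, value)


def values(table: Dict[str, Tuple[int, str]]) -> Dict[str, str]:
    """Strip the timestamps and return only the stored values."""
    return {k: v for k, (_, v) in table.items()}


# Target state after a broken replication run: order:2 never arrived,
# while order:1 already holds a change that is newer than the dump.
target: Dict[str, Tuple[int, str]] = {}
load(target, [("order:1", "shipped", 5_000)])

# Hypothetical dump from the source, each row with its change timestamp.
dump = [
    ("order:1", "paid", 4_000),
    ("order:2", "new", 4_500),
]

load(target, dump)
after_first_load = values(target)

# Loading the same dump a second time changes nothing.
load(target, dump)
after_second_load = values(target)
```

The dump fills in the missing row without overwriting the newer live change, 
and loading it twice is harmless, which is why the readers can keep consuming 
the queue during the reload.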

This way it’s possible to achieve quite low latency (seconds, not minutes) for 
the replication.

Cheers,
Hannu

> On 02 Mar 2016, at 03:11, anil_ah <anil...@yahoo.co.in> wrote:
> 
> Hi 
>    I want to run spark job to do incremental sync from oracle to 
> cassandra,job interval could be one minute.we are looking for a real time 
> replication with latency of 1 or 2 min.
> 
> Please advise  what would be best Approch
> 
> 1)oracle db->spark sql ->spark->cassandra.
> 2)oracle db ->sqoop->cassandra 
> 
> Please advise which option is good in term of scalable,incremental etc
> 
> Regards 
> Anil
> 
> 
> 
> Sent from my Samsung device
