Hi,

We basically have the same scenario, but worldwide, and since we have bigger datasets we use OGG --> local --> Sqoop into Hadoop. By all means you can have Spark read the Oracle tables and then apply any changes to the data that can't be done in a Sqoop query, e.g. fraud detection on transaction records.
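To make that concrete, a minimal sketch of that pattern in Spark (Scala) might look like the following. Everything specific here is assumed for illustration: the JDBC URL, the TRANSACTIONS table, the column names, and the flagging rule; you would also need the Oracle JDBC driver (ojdbc) on the classpath.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object OracleEnrichmentSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("oracle-transactions-enrichment")
      .getOrCreate()

    // Read the source table over JDBC (hypothetical host, service and user).
    val txns = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB")
      .option("dbtable", "TRANSACTIONS")
      .option("user", "etl_user")
      .option("password", sys.env("ORACLE_PW"))
      .option("fetchsize", "10000")
      .load()

    // The kind of enrichment a plain Sqoop import can't do: flag
    // transactions with an (illustrative) fraud-detection rule.
    val flagged = txns.withColumn(
      "suspicious",
      col("amount") > lit(10000) && col("country") =!= col("home_country")
    )

    // Land the result in Hadoop as Parquet for downstream reporting.
    flagged.write.mode("overwrite").parquet("hdfs:///warehouse/transactions_flagged")

    spark.stop()
  }
}

The point is simply that the enrichment step lives in Spark rather than in the import; whether that extra hop is worth it is the trade-off below.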
But sometimes the simplest way is the best. Unless you need to transform the data or do more than a straight copy, I would advise against adding another hop. I would rather move away from files: OGG can do files and direct table loading, then Sqoop for the rest. Simpler is better.

Hope this helps.
C.

From: Saravanan Thirumalai [mailto:saravanan.thiruma...@gmail.com]
Sent: Monday, 16 October 2017 4:29 AM
To: user@spark.apache.org
Subject: Is Spark suited for this use case?

We are an investment firm and have an MDM platform in Oracle at a vendor location; we use Oracle GoldenGate to replicate data to our data center for reporting needs. Our data is not big data (6 TB total, including 2 TB of archive data). Moreover, our data doesn't get updated often: once nightly (around 50 MB), plus some correction transactions during the day (<10 MB). We don't have external users, so the data doesn't grow in real time the way e-commerce data does.

When we replicate data from source to target, we transfer it through files. So if there are DML operations (corrections) on a source table during the day, the corresponding file would have perhaps 100 lines of table data that need to be loaded into the target database. Due to the low volume of data we designed this with Informatica, and it runs in 2-5 minutes.

Can Spark be used in this case, or would it be an overkill of technology use?