Hi,
We have reference data pulled from an RDBMS via a Sqoop job; this
reference data is imported into the analytics platform once a day.
We have a Spark Streaming job that reads the reference data at bootup
and then joins it with the continuously flowing event data. When the
reference data is updated once a day, how do I make sure the Spark
Streaming job picks up the newly updated reference data?
One simple way is to bounce the Spark Streaming job once a day after the
new reference data is imported, but is there a better, less disruptive
approach?
Thanks & Regards
MK
