Hi folks,
We are using Kafka + Spark Streaming in our data pipeline, but sometimes
we have to clean up the checkpoint directory on HDFS before we restart the
Spark Streaming application, otherwise the application fails to start.
That means we are losing data whenever we clean up the checkpoint. Is
there a way to read
Hi folks,
I am using Spark Streaming, and I am not clear whether there is a smart
way to restart the app once it fails. Currently we just have a cron job
that checks every 2 to 5 minutes whether the job is running and restarts
the app when necessary.
According to the Spark Streaming guide:
- *YARN* - Yarn supports a similar mechanism for automatically restarting
an application.
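Until a supervised deploy mode is wired in, the cron approach can at least be made explicit. A minimal sketch of such a watchdog, assuming the app runs on YARN and its name appears uniquely in `yarn application -list` output; the application name and spark-submit arguments are placeholders:

```python
import subprocess

APP_NAME = "my-streaming-app"  # hypothetical application name

def is_running(yarn_list_output: str, app_name: str) -> bool:
    """Return True if the `yarn application -list` output shows app_name
    in RUNNING (or still-queued ACCEPTED) state."""
    for line in yarn_list_output.splitlines():
        if app_name in line and ("RUNNING" in line or "ACCEPTED" in line):
            return True
    return False

def check_and_restart():
    # `yarn application -list` prints one row per live application.
    out = subprocess.run(
        ["yarn", "application", "-list"],
        capture_output=True, text=True, check=True,
    ).stdout
    if not is_running(out, APP_NAME):
        # Resubmit; these spark-submit arguments are placeholders.
        subprocess.run(["spark-submit", "--master", "yarn", "app.py"])

if __name__ == "__main__":
    check_and_restart()
```

A cron entry would then just invoke this script every few minutes; parsing the list output keeps the check independent of exit codes from a detached driver.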