Hi,
I need to run a batch job written in Java that executes several SQL statements
against different Hive tables and then processes each result set's partitions
in a foreachPartition() operator.
I'd like to run these actions in parallel.
I've seen two approaches for achieving this:
1. Using
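For context, here is a minimal sketch of what I have in mind, with each action
submitted from its own driver thread so the jobs can run concurrently (the
queries, table names, and class names are placeholders I made up for
illustration):

import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.hive.HiveContext;

public class ParallelHiveActions {

  // Static nested class (rather than an anonymous inner class) so Spark can
  // serialize it without dragging in an enclosing instance.
  static class PartitionHandler implements VoidFunction<Iterator<Row>> {
    @Override
    public void call(Iterator<Row> rows) {
      while (rows.hasNext()) {
        Row row = rows.next();
        // process each row of the partition here
      }
    }
  }

  public static void main(String[] args) throws Exception {
    SparkConf conf = new SparkConf().setAppName("parallel-hive-actions");
    JavaSparkContext jsc = new JavaSparkContext(conf);
    final HiveContext hiveContext = new HiveContext(jsc.sc());

    // Placeholder queries; each one becomes an independent Spark job.
    List<String> queries = Arrays.asList(
        "SELECT * FROM table_a",
        "SELECT * FROM table_b");

    // foreachPartition() is a blocking action, so running each one on its
    // own driver thread lets the scheduler execute the jobs concurrently.
    ExecutorService pool = Executors.newFixedThreadPool(queries.size());
    for (final String query : queries) {
      pool.submit(new Runnable() {
        @Override
        public void run() {
          DataFrame df = hiveContext.sql(query);
          df.javaRDD().foreachPartition(new PartitionHandler());
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
    jsc.stop();
  }
}

If the concurrent jobs should share cluster resources evenly rather than queue
FIFO, my understanding is that setting spark.scheduler.mode=FAIR is the usual
companion to this pattern.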
Hi,
I'm new to Spark and have started working with Spark 1.5 using the Java API
(about to upgrade to 1.6 soon).
I am deploying a Spark Streaming application using spark-submit in
yarn-cluster mode.
What is the recommended way to perform a graceful shutdown of the Spark job?
Already tried using
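For reference, the two mechanisms I'm aware of are the
spark.streaming.stopGracefullyOnShutdown flag (available since Spark 1.4) and
an explicit stop(true, true) call from a watcher thread. A rough sketch of
both below; the marker-file path is a made-up placeholder:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class GracefulShutdownSketch {
  public static void main(String[] args) throws Exception {
    SparkConf conf = new SparkConf()
        .setAppName("graceful-shutdown-sketch")
        // Drain in-flight batches when the driver JVM receives a shutdown
        // signal, instead of dropping them.
        .set("spark.streaming.stopGracefullyOnShutdown", "true");

    final JavaStreamingContext jssc =
        new JavaStreamingContext(conf, Durations.seconds(10));

    // Trivial stand-in for the real streaming computation.
    jssc.socketTextStream("localhost", 9999).print();

    jssc.start();

    // In yarn-cluster mode the driver runs on an arbitrary node, so an
    // external marker file is one way to reach it. Placeholder path.
    new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          FileSystem fs = FileSystem.get(new Configuration());
          Path marker = new Path("/tmp/stop-my-streaming-job");
          while (!fs.exists(marker)) {
            Thread.sleep(10000);
          }
          // Second argument = stopGracefully: finish processing the data
          // already received before shutting down.
          jssc.stop(true, true);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    }).start();

    jssc.awaitTermination();
  }
}

As far as I can tell, the flag alone covers the case where the application is
killed (YARN sends SIGTERM), while the watcher thread allows a controlled stop
at a moment of my choosing; I'd like to know which route is recommended.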