I am trying to submit a Spark job through Oozie. The job is marked
successful, but it does not actually do anything.
The same job works when submitted with spark-submit on the command line.
The problem seems to be around creating the SparkContext: I added logging
before, after, and in a finally block around the SparkContext creation,
and I only see "test 1" in the logs.
Does anyone have a suggestion for debugging this, or know where I might
find additional logs?
try {
  this.log.warning("test 1")
  sc = new SparkContext(conf)
  this.log.warning("test 2")
} finally {
  this.log.warning("test 3")
}
We are using YARN on CDH 5.5.
When I compare the logs, below is where they start to diverge.
==========Broken===================
server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started
[email protected]:33827
util.Utils (Logging.scala:logInfo(59)) - Successfully started service
'SparkUI' on port xxxxx.
ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at
http://xxxxxxxx:xxxx
cluster.YarnClusterScheduler (Logging.scala:logInfo(59)) - Created
YarnClusterScheduler
metrics.MetricsSystem (Logging.scala:logWarning(71)) - Using default name
DAGScheduler for source because spark.app.id is not set.
util.Utils (Logging.scala:logInfo(59)) - Successfully started service
'org.apache.spark.network.netty.NettyBlockTransferService' on port 16611.
netty.NettyBlockTransferService (Logging.scala:logInfo(59)) - Server
created on xxxxx
storage.BlockManager (Logging.scala:logInfo(59)) - external shuffle service
port = xxxx
storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Trying to register
BlockManager
==========Working===================
AbstractConnector: Started [email protected]:4040
Utils: Successfully started service 'SparkUI' on port 4040.
SparkUI: Started SparkUI at http://xxxxxxx:4040
SparkContext: Added JAR ...
SparkContext: Added JAR file:/xxxxxxx.jar at http://xxxxxxx.jar with
timestamp 1450320527893
MetricsSystem: Using default name DAGScheduler for source because
spark.app.id is not set.
ConfiguredRMFailoverProxyProvider: Failing over to rm238
Client: Requesting a new application from cluster with 25 NodeManagers
Client: Verifying our application has not requested more than the maximum
memory capability of the cluster (65536 MB per container)
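One difference visible above: the working run behaves like client mode (the local process logs "Client: Requesting a new application"), while the broken Oozie run starts straight into YarnClusterScheduler, i.e. cluster mode, where the driver and its output live inside a YARN container rather than in the Oozie launcher log. For reference, the two modes from the command line (class and jar names here are hypothetical):

```shell
# Client mode: driver runs on the submitting host; its log (SparkUI on 4040,
# "Client: Requesting a new application", etc.) appears locally.
spark-submit --master yarn-client --class com.example.MyJob myjob.jar

# Cluster mode: driver runs inside the YARN ApplicationMaster container --
# this is what Oozie's Spark action typically does, so the driver log
# (including any SparkContext failure) ends up in the YARN container logs.
spark-submit --master yarn-cluster --class com.example.MyJob myjob.jar
```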