Hi,

 

I'm running a spark application with YARN-client or YARN-cluster mode.

But it seems to take too long to startup.

It takes 10+ seconds to initialize the spark context.

Is this normal? Or can it be optimized?

 

The environment is as follows:

- Hadoop: Hortonworks HDP 2.2 (Hadoop 2.6)

- Spark: 1.3.1

- Client: Windows 7, but similar result on CentOS 6.6

 

The following is the startup part of the application log. (Some private
information was edited)

'Main: Initializing context' at the first line and 'MainProcessor: Deleting
previous output files' at the last line are the logs by the application.
Others in between are from Spark itself. Application logic is executed after
this log is displayed.

 

---

 

15/05/07 09:18:31 INFO Main: Initializing context

15/05/07 09:18:31 INFO SparkContext: Running Spark version 1.3.1

15/05/07 09:18:31 INFO SecurityManager: Changing view acls to: myuser,myapp

15/05/07 09:18:31 INFO SecurityManager: Changing modify acls to:
myuser,myapp

15/05/07 09:18:31 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(myuser, myapp);
users with modify permissions: Set(myuser, myapp)

15/05/07 09:18:31 INFO Slf4jLogger: Slf4jLogger started

15/05/07 09:18:31 INFO Remoting: Starting remoting

15/05/07 09:18:31 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://sparkDriver@mymachine:54449]

15/05/07 09:18:31 INFO Utils: Successfully started service 'sparkDriver' on
port 54449.

15/05/07 09:18:31 INFO SparkEnv: Registering MapOutputTracker

15/05/07 09:18:32 INFO SparkEnv: Registering BlockManagerMaster

15/05/07 09:18:32 INFO DiskBlockManager: Created local directory at
C:\Users\myuser\AppData\Local\Temp\spark-2d3db9d6-ea78-438e-956f-be9c1dcf3a9
d\blockmgr-e9ade223-a4b8-4d9f-b038-efd66adf9772

15/05/07 09:18:32 INFO MemoryStore: MemoryStore started with capacity 1956.7
MB

15/05/07 09:18:32 INFO HttpFileServer: HTTP File server directory is
C:\Users\myuser\AppData\Local\Temp\spark-ff40d73b-e8ab-433e-88c4-35da27fb627
8\httpd-def9220f-ac3a-4dd2-9ac1-2c593b94b2d9

15/05/07 09:18:32 INFO HttpServer: Starting HTTP Server

15/05/07 09:18:32 INFO Server: jetty-8.y.z-SNAPSHOT

15/05/07 09:18:32 INFO AbstractConnector: Started
SocketConnector@0.0.0.0:54450

15/05/07 09:18:32 INFO Utils: Successfully started service 'HTTP file
server' on port 54450.

15/05/07 09:18:32 INFO SparkEnv: Registering OutputCommitCoordinator

15/05/07 09:18:32 INFO Server: jetty-8.y.z-SNAPSHOT

15/05/07 09:18:32 INFO AbstractConnector: Started
SelectChannelConnector@0.0.0.0:4040

15/05/07 09:18:32 INFO Utils: Successfully started service 'SparkUI' on port
4040.

15/05/07 09:18:32 INFO SparkUI: Started SparkUI at http://mymachine:4040

15/05/07 09:18:32 INFO SparkContext: Added JAR
file:/D:/Projects/MyApp/MyApp.jar at
http://10.111.111.199:54450/jars/MyApp.jar with timestamp 1430957912240

15/05/07 09:18:32 INFO RMProxy: Connecting to ResourceManager at
cluster01/10.111.111.11:8050

15/05/07 09:18:32 INFO Client: Requesting a new application from cluster
with 3 NodeManagers

15/05/07 09:18:32 INFO Client: Verifying our application has not requested
more than the maximum memory capability of the cluster (23040 MB per
container)

15/05/07 09:18:32 INFO Client: Will allocate AM container, with 896 MB
memory including 384 MB overhead

15/05/07 09:18:32 INFO Client: Setting up container launch context for our
AM

15/05/07 09:18:32 INFO Client: Preparing resources for our AM container

15/05/07 09:18:32 INFO Client: Source and destination file systems are the
same. Not copying
hdfs://cluster01/apps/spark/spark-assembly-1.3.1-hadoop2.6.0.jar

15/05/07 09:18:32 INFO Client: Setting up the launch environment for our AM
container

15/05/07 09:18:33 INFO SecurityManager: Changing view acls to: myuser,myapp

15/05/07 09:18:33 INFO SecurityManager: Changing modify acls to:
myuser,myapp

15/05/07 09:18:33 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(myuser, myapp);
users with modify permissions: Set(myuser, myapp)

15/05/07 09:18:33 INFO Client: Submitting application 2 to ResourceManager

15/05/07 09:18:33 INFO YarnClientImpl: Submitted application
application_1430956687773_0002

15/05/07 09:18:34 INFO Client: Application report for
application_1430956687773_0002 (state: ACCEPTED)

15/05/07 09:18:34 INFO Client: 

            client token: N/A

           diagnostics: N/A

           ApplicationMaster host: N/A

           ApplicationMaster RPC port: -1

           queue: default

           start time: 1430957906540

           final status: UNDEFINED

           tracking URL:
http://cluster01:8088/proxy/application_1430956687773_0002/

           user: myapp

15/05/07 09:18:35 INFO Client: Application report for
application_1430956687773_0002 (state: ACCEPTED)

15/05/07 09:18:36 INFO Client: Application report for
application_1430956687773_0002 (state: ACCEPTED)

15/05/07 09:18:37 INFO Client: Application report for
application_1430956687773_0002 (state: ACCEPTED)

15/05/07 09:18:37 INFO YarnClientSchedulerBackend: ApplicationMaster
registered as
Actor[akka.tcp://sparkYarnAM@cluster02:39698/user/YarnAM#-1579648782]

15/05/07 09:18:37 INFO YarnClientSchedulerBackend: Add WebUI Filter.
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS
-> cluster01, PROXY_URI_BASES ->
http://cluster01:8088/proxy/application_1430956687773_0002),
/proxy/application_1430956687773_0002

15/05/07 09:18:37 INFO JettyUtils: Adding filter:
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter

15/05/07 09:18:38 INFO Client: Application report for
application_1430956687773_0002 (state: RUNNING)

15/05/07 09:18:38 INFO Client: 

            client token: N/A

           diagnostics: N/A

           ApplicationMaster host: cluster02

           ApplicationMaster RPC port: 0

           queue: default

           start time: 1430957906540

           final status: UNDEFINED

           tracking URL:
http://cluster01:8088/proxy/application_1430956687773_0002/

           user: myapp

15/05/07 09:18:38 INFO YarnClientSchedulerBackend: Application
application_1430956687773_0002 has started running.

15/05/07 09:18:38 INFO NettyBlockTransferService: Server created on 54491

15/05/07 09:18:38 INFO BlockManagerMaster: Trying to register BlockManager

15/05/07 09:18:38 INFO BlockManagerMasterActor: Registering block manager
mymachine:54491 with 1956.7 MB RAM, BlockManagerId(<driver>, mymachine,
54491)

15/05/07 09:18:38 INFO BlockManagerMaster: Registered BlockManager

15/05/07 09:18:43 INFO YarnClientSchedulerBackend: Registered executor:
Actor[akka.tcp://sparkExecutor@cluster02:44996/user/Executor#-786778979]
with ID 1

15/05/07 09:18:43 INFO YarnClientSchedulerBackend: SchedulerBackend is ready
for scheduling beginning after reached minRegisteredResourcesRatio: 0.8

15/05/07 09:18:43 INFO MainProcessor: Deleting previous output files

 

Thanks.

Reply via email to