Hi Zhan,
I applied the patch you recommended,
https://github.com/apache/spark/pull/3409, and it now works. It was failing
with this:
Exception message:
/hadoop/yarn/local/usercache/root/appcache/application_1425078697953_0020/container_1425078697953_0020_01_000002/launch_container.sh: line 14: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:$PWD/__app__.jar:$PWD/*:: bad substitution
This happened even though spark-defaults.conf has these defined:
spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041
Without the patch, ${hdp.version} was not being substituted.
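
For anyone who hits the same error, here is a minimal sketch of why the launch script fails when the placeholder is left in place. It assumes launch_container.sh is run by bash, and the path is shortened for illustration:

```shell
# Bash variable names cannot contain a dot, so ${hdp.version} is not a valid
# parameter expansion and bash aborts the command instead of expanding it.
# Run it in a child bash so the failure does not stop this script.
out=$(bash -c 'echo "/usr/hdp/${hdp.version}/hadoop/lib"' 2>&1) || true
echo "$out"   # the message ends with "bad substitution"
```

The -Dhdp.version option only sets a JVM system property, which the shell knows nothing about, so the placeholder has to be resolved before the shell sees it.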
Thanks for pointing me to that patch, appreciate it.
-Todd
On Fri, Mar 6, 2015 at 1:12 PM, Zhan Zhang <[email protected]> wrote:
> Hi Todd,
>
> Looks like the thrift server can connect to the metastore, but something is
> wrong in the executors. You can try to get the log with "yarn logs
> -applicationId xxx" to check why it failed. If there is no log (the master or
> executor did not start at all), you can go to the RM webpage and click the
> link to see why the shell failed in the first place.
>
> Thanks.
>
> Zhan Zhang
>
> On Mar 6, 2015, at 9:59 AM, Todd Nist <[email protected]> wrote:
>
> First, thanks to everyone for their assistance and recommendations.
>
> @Marcelo
>
> I applied the patch that you recommended and am now able to get into the
> shell, thank you. It worked great once I realized that the pom was pointing to
> 1.3.0-SNAPSHOT for the parent; I needed to bump that down to 1.2.1.
>
> @Zhan
>
> I need to apply this patch next. I tried to start the spark-thriftserver;
> it starts, then fails as shown below. I have the entries in my
> spark-defaults.conf, but not the patch applied.
>
> ./sbin/start-thriftserver.sh --master yarn --executor-memory 1024m
> --hiveconf hive.server2.thrift.port=10001
>
> 15/03/06 12:34:17 INFO ui.SparkUI: Started SparkUI at http://hadoopdev01.opsdatastore.com:4040
> 15/03/06 12:34:18 INFO impl.TimelineClientImpl: Timeline service address: http://hadoopdev02.opsdatastore.com:8188/ws/v1/timeline/
> 15/03/06 12:34:18 INFO client.RMProxy: Connecting to ResourceManager at hadoopdev02.opsdatastore.com/192.168.15.154:8050
> 15/03/06 12:34:18 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
> 15/03/06 12:34:18 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
> 15/03/06 12:34:18 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
> 15/03/06 12:34:18 INFO yarn.Client: Setting up container launch context for our AM
> 15/03/06 12:34:18 INFO yarn.Client: Preparing resources for our AM container
> 15/03/06 12:34:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> 15/03/06 12:34:19 INFO yarn.Client: Uploading resource file:/root/spark-1.2.1-bin-hadoop2.6/lib/spark-assembly-1.2.1-hadoop2.6.0.jar -> hdfs://hadoopdev01.opsdatastore.com:8020/user/root/.sparkStaging/application_1425078697953_0018/spark-assembly-1.2.1-hadoop2.6.0.jar
> 15/03/06 12:34:21 INFO yarn.Client: Setting up the launch environment for our AM container
> 15/03/06 12:34:21 INFO spark.SecurityManager: Changing view acls to: root
> 15/03/06 12:34:21 INFO spark.SecurityManager: Changing modify acls to: root
> 15/03/06 12:34:21 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
> 15/03/06 12:34:21 INFO yarn.Client: Submitting application 18 to ResourceManager
> 15/03/06 12:34:21 INFO impl.YarnClientImpl: Submitted application application_1425078697953_0018
> 15/03/06 12:34:22 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:22 INFO yarn.Client:
>     client token: N/A
>     diagnostics: N/A
>     ApplicationMaster host: N/A
>     ApplicationMaster RPC port: -1
>     queue: default
>     start time: 1425663261755
>     final status: UNDEFINED
>     tracking URL: http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018/
>     user: root
> 15/03/06 12:34:23 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:24 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:25 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:26 INFO yarn.Client: Application report for application_1425078697953_0018 (state: ACCEPTED)
> 15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://[email protected]:40201/user/YarnAM#-557112763]
> 15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoopdev02.opsdatastore.com, PROXY_URI_BASES -> http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018), /proxy/application_1425078697953_0018
> 15/03/06 12:34:27 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> 15/03/06 12:34:27 INFO yarn.Client: Application report for application_1425078697953_0018 (state: RUNNING)
> 15/03/06 12:34:27 INFO yarn.Client:
>     client token: N/A
>     diagnostics: N/A
>     ApplicationMaster host: hadoopdev08.opsdatastore.com
>     ApplicationMaster RPC port: 0
>     queue: default
>     start time: 1425663261755
>     final status: UNDEFINED
>     tracking URL: http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018/
>     user: root
> 15/03/06 12:34:27 INFO cluster.YarnClientSchedulerBackend: Application application_1425078697953_0018 has started running.
> 15/03/06 12:34:28 INFO netty.NettyBlockTransferService: Server created on 46124
> 15/03/06 12:34:28 INFO storage.BlockManagerMaster: Trying to register BlockManager
> 15/03/06 12:34:28 INFO storage.BlockManagerMasterActor: Registering block manager hadoopdev01.opsdatastore.com:46124 with 265.4 MB RAM, BlockManagerId(<driver>, hadoopdev01.opsdatastore.com, 46124)
> 15/03/06 12:34:28 INFO storage.BlockManagerMaster: Registered BlockManager
> 15/03/06 12:34:47 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
> 15/03/06 12:34:48 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoopdev03.opsdatastore.com:9083
> 15/03/06 12:34:48 INFO hive.metastore: Connected to metastore.
> 15/03/06 12:34:49 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
> 15/03/06 12:34:49 INFO service.AbstractService: HiveServer2: Async execution pool size 100
> 15/03/06 12:34:49 INFO service.AbstractService: Service:OperationManager is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service: SessionManager is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service: CLIService is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:ThriftBinaryCLIService is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service: HiveServer2 is inited.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:OperationManager is started.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:SessionManager is started.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:CLIService is started.
> 15/03/06 12:34:49 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoopdev03.opsdatastore.com:9083
> 15/03/06 12:34:49 INFO hive.metastore: Connected to metastore.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:ThriftBinaryCLIService is started.
> 15/03/06 12:34:49 INFO service.AbstractService: Service:HiveServer2 is started.
> 15/03/06 12:34:49 INFO thriftserver.HiveThriftServer2: HiveThriftServer2 started
> 15/03/06 12:34:49 INFO thrift.ThriftCLIService: ThriftBinaryCLIService listening on 0.0.0.0/0.0.0.0:10001
> 15/03/06 12:34:58 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://[email protected]:40201] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 15/03/06 12:35:02 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://[email protected]:53176/user/YarnAM#-1793579186]
> 15/03/06 12:35:02 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoopdev02.opsdatastore.com, PROXY_URI_BASES -> http://hadoopdev02.opsdatastore.com:8088/proxy/application_1425078697953_0018), /proxy/application_1425078697953_0018
> 15/03/06 12:35:02 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> 15/03/06 12:35:38 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://[email protected]:53176] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 15/03/06 12:35:39 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/json,null}
> 15/03/06 12:35:39 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null}
> 15/03/06 12:35:39 INFO ui.SparkUI: Stopped Spark web UI at http://hadoopdev01.opsdatastore.com:4040
> 15/03/06 12:35:39 INFO scheduler.DAGScheduler: Stopping DAGScheduler
> 15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
> 15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
> 15/03/06 12:35:39 INFO cluster.YarnClientSchedulerBackend: Stopped
> 15/03/06 12:35:40 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
> 15/03/06 12:35:40 INFO storage.MemoryStore: MemoryStore cleared
> 15/03/06 12:35:40 INFO storage.BlockManager: BlockManager stopped
> 15/03/06 12:35:40 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
> 15/03/06 12:35:40 INFO spark.SparkContext: Successfully stopped SparkContext
> 15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
> 15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
> 15/03/06 12:35:40 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
>
>
>
> Thanks again for the help.
>
> -Todd
>
> On Thu, Mar 5, 2015 at 7:06 PM, Zhan Zhang <[email protected]> wrote:
>
>> In addition, if you use HDP 2.2 you may need the following patch, if it is
>> not already in 1.2.1, to solve a system property issue.
>>
>> https://github.com/apache/spark/pull/3409
>>
>> You can follow this link to set hdp.version in the Java options.
>>
>> http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/
>>
>> Thanks.
>>
>> Zhan Zhang
>>
>> On Mar 5, 2015, at 11:09 AM, Marcelo Vanzin <[email protected]>
>> wrote:
>>
>> It seems from the excerpt below that your cluster is set up to use the
>> Yarn ATS, and the code is failing in that path. I think you'll need to
>> apply the following patch to your Spark sources if you want this to
>> work:
>>
>> https://github.com/apache/spark/pull/3938
>>
>> On Thu, Mar 5, 2015 at 10:04 AM, Todd Nist <[email protected]> wrote:
>>
>>
>> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:166)
>>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>     at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:65)
>>     at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>>     at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:348)
>>
>>
>> --
>> Marcelo
>>