Re: welcome a new batch of committers
Congratulations Ishizaki-san.

Thanks,
Madhu.

----- Denny Lee wrote: -----
To: Dongjin Lee
From: Denny Lee
Date: 10/03/2018 06:31 PM
Cc: dev
Subject: Re: welcome a new batch of committers

Congratulations!

On Wed, Oct 3, 2018 at 05:26 Dongjin Lee wrote:

Congratulations to ALL!!

- Dongjin

On Wed, Oct 3, 2018 at 7:48 PM Jack Kolokasis wrote:

Congratulations to all!!

- Iacovos

On 03/10/2018 12:54 PM, Ted Yu wrote:

Congratulations to all!

-------- Original message --------
From: Jungtaek Lim
Date: 10/3/18 2:41 AM (GMT-08:00)
To: Marco Gaido
Cc: dev
Subject: Re: welcome a new batch of committers

Congrats all! You all deserved it.

On Wed, 3 Oct 2018 at 6:35 PM Marco Gaido wrote:

Congrats you all!

On Wed, 3 Oct 2018 at 11:29, Liang-Chi Hsieh wrote:

Congratulations to all new committers!

rxin wrote:
> Hi all,
>
> The Apache Spark PMC has recently voted to add several new committers to
> the project, for their contributions:
>
> - Shane Knapp (contributor to infra)
> - Dongjoon Hyun (contributor to ORC support and other parts of Spark)
> - Kazuaki Ishizaki (contributor to Spark SQL)
> - Xingbo Jiang (contributor to Spark Core and SQL)
> - Yinan Li (contributor to Spark on Kubernetes)
> - Takeshi Yamamuro (contributor to Spark SQL)
>
> Please join me in welcoming them!
Re: Unable to run the spark application in standalone cluster mode
Try increasing the Spark worker memory in conf/spark-env.sh:

    export SPARK_WORKER_MEMORY=2g

Thanks,
Madhu.

----- Ratika Prasad wrote: -----
To: "dev@spark.apache.org"
Date: 08/19/2015 09:22 PM
Subject: Unable to run the spark application in standalone cluster mode

Hi,

We have a simple Spark application that runs fine when launched locally on the master node, as below:

    ./bin/spark-submit --class com.coupons.salestransactionprocessor.SalesTransactionDataPointCreation --master local sales-transaction-processor-0.0.1-SNAPSHOT-jar-with-dependencies.jar

However, when I try to run it in cluster mode [our Spark cluster has two nodes, one master and one slave, with executor memory of 512 MB], the application fails with the log below. Please provide some input as to why.

15/08/19 15:37:52 INFO client.AppClient$ClientActor: Executor updated: app-20150819153234-0001/8 is now RUNNING
15/08/19 15:37:56 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/08/19 15:38:11 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/08/19 15:38:26 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/08/19 15:38:32 INFO client.AppClient$ClientActor: Executor updated: app-20150819153234-0001/8 is now EXITED (Command exited with code 1)
15/08/19 15:38:32 INFO cluster.SparkDeploySchedulerBackend: Executor app-20150819153234-0001/8 removed: Command exited with code 1
15/08/19 15:38:32 INFO client.AppClient$ClientActor: Executor added: app-20150819153234-0001/9 on worker-20150812111932-ip-172-28-161-173.us-west-2.compute.internal-50108 (ip-172-28-161-173.us-west-2.compute.internal:50108) with 1 cores
15/08/19 15:38:32 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20150819153234-0001/9 on hostPort ip-172-28-161-173.us-west-2.compute.internal:50108 with 1 cores, 512.0 MB RAM
15/08/19 15:38:32 INFO client.AppClient$ClientActor: Executor updated: app-20150819153234-0001/9 is now RUNNING
15/08/19 15:38:41 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/08/19 15:38:56 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/08/19 15:39:11 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/08/19 15:39:12 INFO client.AppClient$ClientActor: Executor updated: app-20150819153234-0001/9 is now EXITED (Command exited with code 1)
15/08/19 15:39:12 INFO cluster.SparkDeploySchedulerBackend: Executor app-20150819153234-0001/9 removed: Command exited with code 1
15/08/19 15:39:12 ERROR cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: Master removed our application: FAILED
15/08/19 15:39:12 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/08/19 15:39:12 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/metrics/json,null}
15/08/19 15:39:12 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
15/08/19 15:39:12 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
15/08/19 15:39:12 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
15/08/19 15:39:12 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
15/08/19 15:39:12 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
15/08/19 15:39:12 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler
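For context on the advice above: in standalone mode the master only places executors on workers whose configured SPARK_WORKER_MEMORY can cover the per-executor memory the application requests, and the repeating "Initial job has not accepted any resources" warning is the usual symptom of that mismatch. Below is a minimal, illustrative Scala sketch of the application-side half of that contract; the master URL is a placeholder, not a value from the thread:

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical driver setup; "spark://<master-host>:7077" stands in for
    // the real master URL of the two-node cluster described above.
    val conf = new SparkConf()
      .setAppName("SalesTransactionDataPointCreation")
      .setMaster("spark://<master-host>:7077")
      // Per-executor request: must not exceed SPARK_WORKER_MEMORY on any
      // worker that is expected to host an executor.
      .set("spark.executor.memory", "512m")
    val sc = new SparkContext(conf)

The same request can also be made at submit time with spark-submit's --executor-memory flag; either way it has to fit under the workers' configured limit before the master will schedule the executors.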
RE: Unable to run the spark application in standalone cluster mode
Slave nodes.

Thanks,
Madhu.

----- Ratika Prasad wrote: -----
To: Madhusudanan Kandasamy/India/IBM@IBMIN
Cc: "dev@spark.apache.org"
Date: 08/19/2015 09:33 PM
Subject: RE: Unable to run the spark application in standalone cluster mode

Should this be done on the master or slave node, or both?

From: Madhusudanan Kandasamy [mailto:madhusuda...@in.ibm.com]
Sent: Wednesday, August 19, 2015 9:31 PM
To: Ratika Prasad
Cc: dev@spark.apache.org
Subject: Re: Unable to run the spark application in standalone cluster mode

Try increasing the Spark worker memory in conf/spark-env.sh:

    export SPARK_WORKER_MEMORY=2g

Thanks,
Madhu.
Question on DAGScheduler.getMissingParentStages()
Hi,

I'm new to Spark and trying to understand the DAGScheduler code flow. As far as I can tell, getMissingParentStages() does a redundant job of re-calculating stage dependencies. When the first stage is created, all of its dependent/parent stages are recursively calculated and stored in the stage.parents member. Whenever a given stage needs to be submitted, the scheduler calls getMissingParentStages() to get the list of all un-computed parent stages. I expected getMissingParentStages() to simply walk stage.parents and check whether each of them has already been computed. However, the function performs another graph traversal starting from stage.rdd, which seems unnecessary. Is there a specific reason for that design? If not, I would like to redesign getMissingParentStages() to avoid the graph traversal.

Thanks,
Madhu.
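To make the two designs being compared concrete, here is an illustrative Scala sketch. It is a toy model only: all type and member names (Dep, Rdd, Stage, isAvailable, etc.) are assumptions for the example, not the real org.apache.spark.scheduler APIs, and the first function is heavily simplified (as far as I can tell, the real method also consults cache locations as it walks, which can truncate the lineage at runtime in a way the static stage.parents list cannot reflect).

    import scala.collection.mutable

    object DagSketch {
      // Toy model of the structures involved.
      sealed trait Dep
      case class NarrowDep(rdd: Rdd) extends Dep                   // same stage
      case class ShuffleDep(rdd: Rdd, mapStage: Stage) extends Dep // stage boundary
      case class Rdd(deps: List[Dep])
      class Stage(val rdd: Rdd, val parents: List[Stage]) {
        var isAvailable: Boolean = false // all output partitions computed?
      }

      // Shape of the current design: re-traverse the RDD graph from stage.rdd,
      // rediscovering stage boundaries along the way.
      def missingViaRddTraversal(stage: Stage): List[Stage] = {
        val missing = mutable.LinkedHashSet[Stage]()
        val visited = mutable.HashSet[Rdd]()
        val toVisit = mutable.Stack[Rdd](stage.rdd)
        while (toVisit.nonEmpty) {
          val rdd = toVisit.pop()
          if (visited.add(rdd)) {
            rdd.deps.foreach {
              case ShuffleDep(_, mapStage) =>
                if (!mapStage.isAvailable) missing += mapStage // un-computed parent
              case NarrowDep(parentRdd) =>
                toVisit.push(parentRdd) // still inside the current stage
            }
          }
        }
        missing.toList
      }

      // Shape of the proposed redesign: trust the parent list recorded at
      // stage-creation time and filter out the already-computed ones.
      def missingViaParents(stage: Stage): List[Stage] =
        stage.parents.filterNot(_.isAvailable)
    }

If the two always agreed, the second form would avoid re-walking potentially long narrow-dependency chains on every submission; the open question in the message above is whether the runtime checks done during the traversal (such as the cache-location check) are what make the first form necessary.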