The job ended up running overnight with no progress. :-(
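For context, the shape of my job is roughly the sketch below. This is not my actual code: a plain Set stands in for the Trie, and s3n://documents is a placeholder path.

    import org.apache.spark.{SparkConf, SparkContext}

    object ClosureCaptureSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("location-match"))

        // Build a large lookup structure on the driver. A plain Set stands
        // in for the Trie; s3n://documents below is a placeholder path.
        val locations: Set[String] =
          sc.textFile("s3n://geonames").collect().toSet

        // Referencing `locations` inside the closure serializes the whole
        // structure into every task, which is consistent with the ~28 MB
        // "Serialized task" lines in the logs below.
        val docs = sc.textFile("s3n://documents")
        val matches = docs.filter(line => locations.contains(line))
        println(matches.count())
      }
    }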
On Sat, Aug 16, 2014 at 12:16 AM, Jerry Ye <jerr...@gmail.com> wrote:
> Hi Xiangrui,
> I actually tried branch-1.1 and master, and both resulted in the job being
> stuck at the TaskSetManager:
>
> 14/08/16 06:55:48 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
> 14/08/16 06:55:48 INFO scheduler.TaskSetManager: Starting task 1.0:0 as TID 2 on executor 8: ip-10-226-199-225.us-west-2.compute.internal (PROCESS_LOCAL)
> 14/08/16 06:55:48 INFO scheduler.TaskSetManager: Serialized task 1.0:0 as 28055875 bytes in 162 ms
> 14/08/16 06:55:48 INFO scheduler.TaskSetManager: Starting task 1.0:1 as TID 3 on executor 0: ip-10-249-53-62.us-west-2.compute.internal (PROCESS_LOCAL)
> 14/08/16 06:55:48 INFO scheduler.TaskSetManager: Serialized task 1.0:1 as 28055875 bytes in 178 ms
>
> It's been 10 minutes with no progress on relatively small data. I'll let
> it run overnight and update in the morning. Is there somewhere I should
> look to see what is happening? I tried to ssh into an executor and look
> at /root/spark/logs, but there wasn't anything informative there.
>
> I'm sure countByValue works fine, but my use of a HashMap is only an
> example. In my actual task, I'm loading a Trie data structure to perform
> efficient string matching between a dataset of locations and strings that
> may contain mentions of those locations.
>
> This seems like a common need: processing input against a relatively
> memory-intensive object like a Trie. I hope I'm not missing something
> obvious. Do you know of any example code like my use case?
>
> Thanks!
>
> - jerry
>
> On Fri, Aug 15, 2014 at 10:02 PM, Xiangrui Meng <men...@gmail.com> wrote:
>
>> Just saw you used toArray on an RDD. That copies all of the data to the
>> driver, and it is deprecated. countByValue is what you need:
>>
>> val samples = sc.textFile("s3n://geonames")
>> val counts = samples.countByValue()
>> val result = samples.map(l => (l, counts.getOrElse(l, 0L)))
>>
>> Could you also try the latest branch-1.1 or master with the default
>> akka.frameSize setting? The serialized task size should be small
>> because we now broadcast RDD objects.
>>
>> -Xiangrui
>>
>> On Fri, Aug 15, 2014 at 5:11 PM, jerryye <jerr...@gmail.com> wrote:
>>
>>> Hi Xiangrui,
>>> You were right, I had to use --driver-memory on the command line
>>> instead of setting it in spark-defaults.conf.
>>>
>>> However, now my job just hangs with the following message:
>>>
>>> 14/08/15 23:54:46 INFO scheduler.TaskSetManager: Serialized task 1.0:0 as 29433434 bytes in 202 ms
>>> 14/08/15 23:54:46 INFO scheduler.TaskSetManager: Starting task 1.0:1 as TID 3 on executor 1: ip-10-226-198-31.us-west-2.compute.internal (PROCESS_LOCAL)
>>> 14/08/15 23:54:46 INFO scheduler.TaskSetManager: Serialized task 1.0:1 as 29433434 bytes in 203 ms
>>>
>>> Any ideas on where else to look?
>>>
>>> On Fri, Aug 15, 2014 at 3:29 PM, Xiangrui Meng wrote:
>>>
>>>> Did you verify the driver memory in the Executors tab of the web UI?
>>>> I think you need `--driver-memory 8g` with spark-shell or
>>>> spark-submit instead of setting it in spark-defaults.conf.
>>>>
>>>> On Fri, Aug 15, 2014 at 12:41 PM, jerryye wrote:
>>>>
>>>>> Setting spark.driver.memory has no effect. It's still hanging trying
>>>>> to compute result.count when I'm sampling more than 35%, regardless
>>>>> of what value of spark.driver.memory I set.
>>>>>
>>>>> Here are my settings:
>>>>>
>>>>> export SPARK_JAVA_OPTS="-Xms5g -Xmx10g -XX:MaxPermSize=10g"
>>>>> export SPARK_MEM=10g
>>>>>
>>>>> In conf/spark-defaults.conf:
>>>>>
>>>>> spark.driver.memory 1500
>>>>> spark.serializer org.apache.spark.serializer.KryoSerializer
>>>>> spark.kryoserializer.buffer.mb 500
>>>>> spark.executor.memory 58315m
>>>>> spark.executor.extraLibraryPath /root/ephemeral-hdfs/lib/native/
>>>>> spark.executor.extraClassPath /root/ephemeral-hdfs/conf