I had been getting a warning about /tmp/hive not being writable whenever I started spark-shell, but I'd been ignoring it. I set its permissions to 777 and restarted the shell, and after doing that I now get the same result as Ted Yu when running Seq(1,2).toDS.map(t => Test(t)).show.
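For anyone who hits the same thing: as far as I know, /tmp/hive is Hive's default scratch directory, and the startup warning is the clue. A quick way to sanity-check it from the shell itself (just an illustrative sketch, not something from the original logs) is:

import java.io.File

// /tmp/hive must be writable by the user running spark-shell; the startup
// warning appears when it is not.
val scratch = new File("/tmp/hive")
println(s"exists=${scratch.exists}, canWrite=${scratch.canWrite}")

If canWrite comes back false, fix the permissions outside the shell (chmod 777 /tmp/hive in my case) and restart spark-shell before retrying the snippet below.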
On Wed, Jun 1, 2016 at 9:05 AM Tim Gautier <tim.gaut...@gmail.com> wrote:

> I spun up another EC2 cluster today with Spark 1.6.1 and I still get the error.
>
> scala> case class Test(a: Int)
> defined class Test
>
> scala> Seq(1,2).toDS.map(t => Test(t)).show
> 16/06/01 15:04:21 WARN scheduler.TaskSetManager: Lost task 39.0 in stage 0.0 (TID 39, ip-10-2-2-203.us-west-2.compute.internal): java.lang.NoClassDefFoundError: Could not initialize class $line29.$read$
>   at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
>   at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
>
> 16/06/01 15:04:21 INFO scheduler.TaskSetManager: Starting task 39.1 in stage 0.0 (TID 40, ip-10-2-2-111.us-west-2.compute.internal, partition 39,PROCESS_LOCAL, 2386 bytes)
> 16/06/01 15:04:21 WARN scheduler.TaskSetManager: Lost task 19.0 in stage 0.0 (TID 19, ip-10-2-2-203.us-west-2.compute.internal): java.lang.ExceptionInInitializerError
>   at $line29.$read$$iwC.<init>(<console>:7)
>   at $line29.$read.<init>(<console>:24)
>   at $line29.$read$.<init>(<console>:28)
>   at $line29.$read$.<clinit>(<console>)
>   at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
>   at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at $line3.$read$$iwC$$iwC.<init>(<console>:15)
>   at $line3.$read$$iwC.<init>(<console>:24)
>   at $line3.$read.<init>(<console>:26)
>   at $line3.$read$.<init>(<console>:30)
>   at $line3.$read$.<clinit>(<console>)
>   ... 18 more
>
> On Tue, May 31, 2016 at 8:48 PM Tim Gautier <tim.gaut...@gmail.com> wrote:
>
>> That's really odd.
>> I copied that code directly out of the shell and it errored out on me,
>> several times. I wonder if something I did previously caused some
>> instability. I'll see if it happens again tomorrow.
>>
>> On Tue, May 31, 2016, 8:37 PM Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Using spark-shell of 1.6.1:
>>>
>>> scala> case class Test(a: Int)
>>> defined class Test
>>>
>>> scala> Seq(1,2).toDS.map(t => Test(t)).show
>>> +---+
>>> |  a|
>>> +---+
>>> |  1|
>>> |  2|
>>> +---+
>>>
>>> FYI
>>>
>>> On Tue, May 31, 2016 at 7:35 PM, Tim Gautier <tim.gaut...@gmail.com> wrote:
>>>
>>>> 1.6.1. The exception is a null pointer exception. I'll paste the whole
>>>> thing after I fire my cluster up again tomorrow.
>>>>
>>>> I take it by the responses that this is supposed to work?
>>>>
>>>> Anyone know when the next version is coming out? I keep running into
>>>> bugs with 1.6.1 that are hindering my progress.
>>>>
>>>> On Tue, May 31, 2016, 8:21 PM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>>>
>>>>> It works fine in my local test. I'm using the latest master, so maybe
>>>>> this bug is already fixed.
>>>>>
>>>>> On Wed, Jun 1, 2016 at 7:29 AM, Michael Armbrust <mich...@databricks.com> wrote:
>>>>>
>>>>>> Version of Spark? What is the exception?
>>>>>>
>>>>>> On Tue, May 31, 2016 at 4:17 PM, Tim Gautier <tim.gaut...@gmail.com> wrote:
>>>>>>
>>>>>>> How should I go about mapping from, say, a Dataset[(Int,Int)] to a
>>>>>>> Dataset[<case class here>]?
>>>>>>>
>>>>>>> I tried to use a map, but it throws exceptions:
>>>>>>>
>>>>>>> case class Test(a: Int)
>>>>>>> Seq(1,2).toDS.map(t => Test(t)).show
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Tim
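
P.S. For the original question (going from a Dataset[(Int,Int)] to a Dataset of a case class), the obvious pattern is just a plain map over the tuples. A minimal sketch (Pair and the sample values are made up for illustration, and it assumes the sqlContext implicits that spark-shell imports automatically):

case class Pair(a: Int, b: Int)                       // hypothetical target case class
val tuples = Seq((1, 10), (2, 20)).toDS()             // Dataset[(Int, Int)]
val pairs = tuples.map { case (a, b) => Pair(a, b) }  // Dataset[Pair]
pairs.show()                                          // two rows, columns a and b

The case class just needs to be defined before it's used in the map so the shell can find an encoder for it.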