I spun up another EC2 cluster today with Spark 1.6.1 and I still get the error.
scala> case class Test(a: Int)
defined class Test

scala> Seq(1,2).toDS.map(t => Test(t)).show
16/06/01 15:04:21 WARN scheduler.TaskSetManager: Lost task 39.0 in stage 0.0 (TID 39, ip-10-2-2-203.us-west-2.compute.internal): java.lang.NoClassDefFoundError: Could not initialize class $line29.$read$
    at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
    at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
16/06/01 15:04:21 INFO scheduler.TaskSetManager: Starting task 39.1 in stage 0.0 (TID 40, ip-10-2-2-111.us-west-2.compute.internal, partition 39,PROCESS_LOCAL, 2386 bytes)
16/06/01 15:04:21 WARN scheduler.TaskSetManager: Lost task 19.0 in stage 0.0 (TID 19, ip-10-2-2-203.us-west-2.compute.internal): java.lang.ExceptionInInitializerError
    at $line29.$read$$iwC.<init>(<console>:7)
    at $line29.$read.<init>(<console>:24)
    at $line29.$read$.<init>(<console>:28)
    at $line29.$read$.<clinit>(<console>)
    at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
    at $line33.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:35)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:149)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at $line3.$read$$iwC$$iwC.<init>(<console>:15)
    at $line3.$read$$iwC.<init>(<console>:24)
    at $line3.$read.<init>(<console>:26)
    at $line3.$read$.<init>(<console>:30)
    at $line3.$read$.<clinit>(<console>)
    ... 18 more

On Tue, May 31, 2016 at 8:48 PM Tim Gautier <tim.gaut...@gmail.com> wrote:

> That's really odd. I copied that code directly out of the shell and it
> errored out on me, several times. I wonder if something I did previously
> caused some instability. I'll see if it happens again tomorrow.
>
> On Tue, May 31, 2016, 8:37 PM Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Using spark-shell of 1.6.1 :
>>
>> scala> case class Test(a: Int)
>> defined class Test
>>
>> scala> Seq(1,2).toDS.map(t => Test(t)).show
>> +---+
>> |  a|
>> +---+
>> |  1|
>> |  2|
>> +---+
>>
>> FYI
>>
>> On Tue, May 31, 2016 at 7:35 PM, Tim Gautier <tim.gaut...@gmail.com> wrote:
>>
>>> 1.6.1 The exception is a null pointer exception. I'll paste the whole
>>> thing after I fire my cluster up again tomorrow.
>>>
>>> I take it by the responses that this is supposed to work?
>>>
>>> Anyone know when the next version is coming out? I keep running into
>>> bugs with 1.6.1 that are hindering my progress.
>>>
>>> On Tue, May 31, 2016, 8:21 PM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>>
>>>> It works fine in my local test, I'm using latest master, maybe this bug
>>>> is already fixed.
>>>>
>>>> On Wed, Jun 1, 2016 at 7:29 AM, Michael Armbrust <mich...@databricks.com> wrote:
>>>>
>>>>> Version of Spark? What is the exception?
>>>>>
>>>>> On Tue, May 31, 2016 at 4:17 PM, Tim Gautier <tim.gaut...@gmail.com> wrote:
>>>>>
>>>>>> How should I go about mapping from say a Dataset[(Int,Int)] to a
>>>>>> Dataset[<case class here>]?
>>>>>>
>>>>>> I tried to use a map, but it throws exceptions:
>>>>>>
>>>>>> case class Test(a: Int)
>>>>>> Seq(1,2).toDS.map(t => Test(t)).show
>>>>>>
>>>>>> Thanks,
>>>>>> Tim
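
For readers finding this thread later: the tuple-to-case-class mapping the original question asks about can be written as a standalone application rather than in spark-shell, which sidesteps the REPL classloading wrappers (the `$line29.$read$` / `$iwC` names in the stack trace). This is a minimal sketch, not the list's confirmed fix; it assumes the Spark 2.x `SparkSession` API rather than the 1.6 `SQLContext`, and the object and app names are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

// Define the case class at the top level of a compiled file. In spark-shell,
// interactively defined classes are wrapped in synthetic REPL classes, which
// is what the $line29.$read$ references in the NoClassDefFoundError point at.
case class Test(a: Int, b: Int)

object TupleToCaseClass {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("tuple-to-case-class")
      .getOrCreate()
    import spark.implicits._

    // Dataset[(Int, Int)] -> Dataset[Test] via a pattern-matching map,
    // as asked in the original question.
    val ds = Seq((1, 2), (3, 4)).toDS()
      .map { case (a, b) => Test(a, b) }

    ds.show()
    spark.stop()
  }
}
```

The same `map { case (a, b) => ... }` shape works for any tuple arity, as long as an implicit `Encoder` is in scope for the target case class (which `import spark.implicits._` provides for ordinary top-level case classes).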