I think I saw this one already, as the first indication that something was wrong, and it was related to https://issues.apache.org/jira/browse/SPARK-13516
2016-05-28 1:34 GMT+02:00 Koert Kuipers <ko...@tresata.com>:

> it seemed to be related to an Aggregator, so for tests we replaced it with
> an ordinary Dataset.reduce operation, and now we get:
>
> java.lang.NegativeArraySizeException
>     at org.apache.spark.unsafe.types.UTF8String.getBytes(UTF8String.java:229)
>     at org.apache.spark.unsafe.types.UTF8String.toString(UTF8String.java:821)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>     at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>     at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:147)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>     at org.apache.spark.scheduler.Task.run(Task.scala:85)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> i did get the generated code, but it's like 17 subtrees, and it's not a
> test but a real company program, so i cannot just send it over.
>
> i will try to create a small test program to reproduce it.
>
> On Fri, May 27, 2016 at 4:25 PM, Reynold Xin <r...@databricks.com> wrote:
>
>> They should get printed if you turn on debug level logging.
>>
>> On Fri, May 27, 2016 at 1:00 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> hello all,
>>> after getting our unit tests to pass on spark 2.0.0-SNAPSHOT we are now
>>> trying to run some algorithms at scale on our cluster.
>>> unfortunately this means that when i see errors i am having a harder
>>> time boiling it down to a small reproducible example.
>>>
>>> today we are running an iterative algo using the dataset api and we are
>>> seeing tasks fail with errors which seem to be related to unsafe
>>> operations. the same tasks succeed without issues in our unit tests.
>>>
>>> i see either:
>>>
>>> 16/05/27 12:54:46 ERROR executor.Executor: Exception in task 31.0 in stage 21.0 (TID 1073)
>>> java.lang.NegativeArraySizeException
>>>     at org.apache.spark.unsafe.types.UTF8String.getBytes(UTF8String.java:229)
>>>     at org.apache.spark.unsafe.types.UTF8String.toString(UTF8String.java:821)
>>>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
>>>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>>>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>>>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>>>     at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>>>     at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>>>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>>>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>>>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown Source)
>>>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>>>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>>>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$7$$anon$1.hasNext(WholeStageCodegenExec.scala:359)
>>>     at org.apache.spark.sql.execution.aggregate.SortBasedAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortBasedAggregateExec.scala:74)
>>>     at org.apache.spark.sql.execution.aggregate.SortBasedAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortBasedAggregateExec.scala:71)
>>>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:775)
>>>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:775)
>>>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
>>>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
>>>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
>>>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
>>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>>>     at org.apache.spark.scheduler.Task.run(Task.scala:85)
>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>
>>> or alternatively:
>>>
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> #  SIGSEGV (0xb) at pc=0x00007fe571041cba, pid=2450, tid=140622965913344
>>> #
>>> # JRE version: Java(TM) SE Runtime Environment (7.0_75-b13) (build 1.7.0_75-b13)
>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode linux-amd64 compressed oops)
>>> # Problematic frame:
>>> # v  ~StubRoutines::jbyte_disjoint_arraycopy
>>>
>>> i assume the best thing would be to try to get it to print out the
>>> generated code that is causing this?
>>> what switch do i need to use again to do so?
>>> thanks,
>>> koert
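For reference, the "switch" Reynold points to above is a logger level: the Catalyst code generator logs the Java source it compiles at DEBUG. Below is a minimal sketch of two ways to get at the generated code, assuming Spark 2.0 with its bundled log4j 1.x; the CodegenDebug object and the toy Dataset are made up for illustration, and in 2.0 the execution.debug package should also expose a debugCodegen() helper for dumping a single plan.

    import org.apache.log4j.{Level, Logger}
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.execution.debug._ // adds debugCodegen() to Datasets in 2.0

    object CodegenDebug {
      def main(args: Array[String]): Unit = {
        // the code generator logs each class it compiles at DEBUG level, so
        // raising this logger makes the generated Java source appear in the logs
        Logger.getLogger("org.apache.spark.sql.catalyst.expressions.codegen")
          .setLevel(Level.DEBUG)

        val spark = SparkSession.builder().appName("codegen-debug").getOrCreate()
        import spark.implicits._

        val ds = spark.range(10).map(_.toString) // stand-in for the failing Dataset
        ds.debugCodegen()                        // dump whole-stage codegen for this plan
      }
    }

The same logger can be set cluster-wide via log4j.properties so that executors print the code as well, which matters here since the failure happens in tasks, not on the driver.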
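And the kind of swap Koert describes, replacing an Aggregator with an ordinary Dataset.reduce, might look like the following minimal sketch. SumAgg is a hypothetical stand-in, since the real aggregator lives in a private codebase; the point is only that both paths compute the same result while exercising different code-generation machinery.

    import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
    import org.apache.spark.sql.expressions.Aggregator

    // hypothetical Aggregator: sums a Dataset[Int] down to a single Int
    object SumAgg extends Aggregator[Int, Int, Int] {
      def zero: Int = 0
      def reduce(b: Int, a: Int): Int = b + a
      def merge(b1: Int, b2: Int): Int = b1 + b2
      def finish(r: Int): Int = r
      def bufferEncoder: Encoder[Int] = Encoders.scalaInt
      def outputEncoder: Encoder[Int] = Encoders.scalaInt
    }

    object AggVsReduce {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("agg-vs-reduce").getOrCreate()
        import spark.implicits._

        val ds = spark.createDataset(1 to 100)

        // Aggregator path: goes through the typed aggregation operators
        val viaAgg = ds.select(SumAgg.toColumn).head()

        // plain Dataset.reduce path: a functional fold over the partitions
        val viaReduce = ds.reduce(_ + _)

        println(s"$viaAgg == $viaReduce")
      }
    }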