Thanks for the response.

That seems like a limitation. If resources are available, why should the job
have to be split into smaller durations? Performance is not the concern here.

This issue is not about performance optimization; the job is failing with a
NullPointerException.

Do you know in what scenarios UnsafeInMemorySorter can be null? Since Spark is
throwing this error, how can the client handle it, or better, prevent it?
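
For context, the kind of splitting that does work today looks roughly like the
sketch below. It is only an illustration: the input path, column name, date
range and per-window action are placeholders, not the real job.

// Hedged sketch: run the same input in smaller monthly windows instead of a
// single >= 10 month duration. All names and values are illustrative.
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import java.time.LocalDate

val spark = SparkSession.builder().master("local[*]").getOrCreate()
val inputDf: DataFrame = spark.read.parquet("/data/events")    // placeholder path

def processWindow(df: DataFrame): Unit = df.count()            // placeholder action

var windowStart = LocalDate.of(2024, 1, 1)                     // illustrative range
val rangeEnd    = LocalDate.of(2024, 11, 1)
while (windowStart.isBefore(rangeEnd)) {
  val windowEnd = windowStart.plusMonths(1)
  val slice = inputDf.filter(
    col("event_date") >= windowStart.toString &&
    col("event_date") < windowEnd.toString)
  processWindow(slice)
  windowStart = windowEnd
}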

Regards,
Ashwani

On Wed, Nov 13, 2024 at 10:10 AM Gurunandan <gurunandan....@gmail.com>
wrote:

> You should be able to split the large job into more manageable jobs based
> on stages using checkpoints. If a job fails, it can be restarted from
> the latest checkpoint, saving time and resources; checkpoints can thus
> be used as recovery points.
> Smaller stages can be optimized independently, leading to better
> resource utilization and faster execution.
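>
> A minimal sketch of what I mean is below; the checkpoint directory, paths and
> DataFrames are placeholders, not specific recommendations.
>
> // Checkpoint sketch with placeholder paths: Dataset.checkpoint(eager = true)
> // materializes the result and truncates its lineage, so downstream stages
> // read the saved data instead of re-running everything that came before.
> import org.apache.spark.sql.SparkSession
>
> val spark = SparkSession.builder().master("local[*]").getOrCreate()
> spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")  // placeholder dir
>
> val stage1 = spark.read.parquet("/data/input").groupBy("key").count()
> val stage1Ckpt = stage1.checkpoint(eager = true)               // recovery point
>
> // Later stages build on the checkpointed result rather than the full lineage.
> val stage2 = stage1Ckpt.filter("count > 1")
> stage2.write.mode("overwrite").parquet("/data/output")         // placeholder output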
>
> regards,
> Guru
>
> On Tue, Nov 12, 2024 at 8:54 AM Ashwani Pundir
> <ashwani.pund...@gmail.com> wrote:
> >
> > [TYPO CORRECTION]
> > "I am manually un-persisting various RDDs progressively and that might
> have something to do with this error. "
> >
> > On Tue, Nov 12, 2024 at 8:51 AM Ashwani Pundir <
> ashwani.pund...@gmail.com> wrote:
> >>
> >> Hi Guru,
> >>
> >> Many thanks for taking the time to look into this issue. I really
> appreciate it.
> >>
> >> Regarding your suggestion, I have already checked, and it seems this
> problem is not caused by nulls in the input data. The reason is that the
> issue only happens when the 'job processing duration' is >= 10 months (for
> some reason I have not been able to figure out). When the job is broken down
> into smaller time durations for the same input data, it completes
> successfully.
> >> To me, it seems like it has something to do with the DISK_ONLY persistence
> that I am using in my job. I am manually persisting RDDs progressively and
> that might have something to do with this error. But as per my
> understanding, even if the blocks are not available after the un-persists,
> the RDD should simply be recomputed, not cause a NullPointerException.
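> >>
> >> For reference, the persist/un-persist pattern in the job is roughly the toy
> >> sketch below (data and names are placeholders, not the actual job):
> >>
> >> // Rough sketch only: persist an intermediate RDD to disk, use it, then
> >> // un-persist it progressively once it is no longer needed.
> >> import org.apache.spark.sql.SparkSession
> >> import org.apache.spark.storage.StorageLevel
> >>
> >> val spark = SparkSession.builder().master("local[*]").getOrCreate()
> >> val sc = spark.sparkContext
> >>
> >> val intermediate = sc.parallelize(1 to 1000).map(_ * 2)
> >>   .persist(StorageLevel.DISK_ONLY)
> >> intermediate.count()                      // materialize the DISK_ONLY blocks
> >>
> >> val next = intermediate.map(_ + 1)        // downstream step built on it
> >> next.count()
> >>
> >> // My understanding is that any access after this point should recompute the
> >> // RDD rather than throw a NullPointerException.
> >> intermediate.unpersist(blocking = true)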
> >>
> >> Looking forward to some guidance on how I should proceed further.
> >>
> >> Regards,
> >> Ashwani
> >>
> >>
> >>
> >> On Mon, Nov 11, 2024 at 7:23 PM Gurunandan <gurunandan....@gmail.com>
> wrote:
> >>>
> >>> Hi Ashwani,
> >>> Please verify the input data by ensuring that the data being processed is
> >>> valid and free of null values or unexpected data types.
> >>> If the data undergoes complex transformations before sorting, review those
> >>> transformations and verify that they don't introduce inconsistencies or
> >>> null values.
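> >>>
> >>> A quick way to sanity-check the input is sketched below; inputDf and the
> >>> columns are placeholders for your actual DataFrame.
> >>>
> >>> // Illustrative null check: count the null values in every column of the
> >>> // input DataFrame (any non-zero count is worth a closer look).
> >>> import org.apache.spark.sql.functions.{col, count, when}
> >>>
> >>> val nullCounts = inputDf.select(
> >>>   inputDf.columns.map(c => count(when(col(c).isNull, c)).alias(c)): _*)
> >>> nullCounts.show()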
> >>>
> >>> regards,
> >>> Guru
> >>>
> >>> On Mon, Nov 11, 2024 at 6:06 PM Ashwani Pundir
> >>> <ashwani.pund...@gmail.com> wrote:
> >>> >
> >>> > Dear Spark Devs,
> >>> >
> >>> > I am reaching out because I am struggling to find the root cause of
> the exception below.
> >>> > (I have spent almost 4 days trying to figure it out. I searched
> various forums, including StackOverflow, but did not find anyone reporting
> this kind of error.)
> >>> > -----------------------------------------------------------------
> >>> > 2024-11-11 13:56:43,932  [Executor task launch worker for task 0.0
> in stage 28.0 (TID 1007)] DEBUG - o.a.s.s.e.c. Compressor for
> [AggAllLevel]:
> org.apache.spark.sql.execution.columnar.compression.PassThrough$Encoder@766c1b92,
> ratio: 1.0
> >>> > 2024-11-11 13:56:43,936  [Executor task launch worker for task 0.0
> in stage 28.0 (TID 1007)] WARN  - o.a.s.s.BlockManager Putting block
> rdd_66_0 failed due to exception java.lang.NullPointerException: Cannot
> invoke
> "org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.numRecords()"
> because "this.inMemSorter" is null.
> >>> > 2024-11-11 13:56:43,937  [Executor task launch worker for task 0.0
> in stage 28.0 (TID 1007)] WARN  - o.a.s.s.BlockManager Block rdd_66_0 could
> not be removed as it was not found on disk or in memory
> >>> > 2024-11-11 13:56:43,937  [dispatcher-BlockManagerMaster] DEBUG -
> o.a.s.s.BlockManagerMasterEndpoint Updating block info on master rdd_66_0
> for BlockManagerId(driver, ibpri011662.emeter.com, 17623, None)
> >>> > 2024-11-11 13:56:43,937  [Executor task launch worker for task 0.0
> in stage 28.0 (TID 1007)] DEBUG - o.a.s.s.BlockManagerMaster Updated info
> of block rdd_66_0
> >>> > 2024-11-11 13:56:43,937  [Executor task launch worker for task 0.0
> in stage 28.0 (TID 1007)] DEBUG - o.a.s.s.BlockManager Told master about
> block rdd_66_0
> >>> > 2024-11-11 13:56:43,941  [Executor task launch worker for task 0.0
> in stage 28.0 (TID 1007)] ERROR - o.a.s.e.Executor Exception in task 0.0 in
> stage 28.0 (TID 1007)
> >>> > java.lang.NullPointerException: Cannot invoke
> "org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.numRecords()"
> because "this.inMemSorter" is null
> >>> > at
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:478)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:138)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage4.sort_addToSorter_0$(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage4.processNext(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage8.smj_findNextJoinRows_0$(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage8.processNext(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer$$anon$1.hasNext(InMemoryRelation.scala:119)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anon$2.hasNext(InMemoryRelation.scala:288)
> ~[em-spark-assembly.jar!/:?]
> >>> > at scala.collection.Iterator$$anon$6.hasNext(Iterator.scala:477)
> ~[em-spark-assembly.jar!/:?]
> >>> > at scala.collection.Iterator$$anon$9.hasNext(Iterator.scala:583)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificColumnarIterator.hasNext(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.smj_findNextJoinRows_0$(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer$$anon$1.hasNext(InMemoryRelation.scala:119)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anon$2.hasNext(InMemoryRelation.scala:288)
> ~[em-spark-assembly.jar!/:?]
> >>> > at scala.collection.Iterator$$anon$6.hasNext(Iterator.scala:477)
> ~[em-spark-assembly.jar!/:?]
> >>> > at scala.collection.Iterator$$anon$9.hasNext(Iterator.scala:583)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificColumnarIterator.hasNext(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.smj_findNextJoinRows_0$(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown
> Source) ~[?:?]
> >>> > at
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer$$anon$1.hasNext(InMemoryRelation.scala:119)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anon$2.hasNext(InMemoryRelation.scala:288)
> ~[em-spark-assembly.jar!/:?]
> >>> > at scala.collection.Iterator$$anon$6.hasNext(Iterator.scala:477)
> ~[em-spark-assembly.jar!/:?]
> >>> > at scala.collection.Iterator$$anon$9.hasNext(Iterator.scala:583)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificColumnarIterator.hasNext(Unknown
> Source) ~[?:?]
> >>> > at scala.collection.Iterator$$anon$9.hasNext(Iterator.scala:583)
> ~[em-spark-assembly.jar!/:?]
> >>> > at scala.collection.Iterator$$anon$9.hasNext(Iterator.scala:583)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:139)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.serializer.SerializerManager.dataSerializeStream(SerializerManager.scala:177)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$6(BlockManager.scala:1635)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$6$adapted(BlockManager.scala:1633)
> ~[em-spark-assembly.jar!/:?]
> >>> > at org.apache.spark.storage.DiskStore.put(DiskStore.scala:88)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1633)
> ~[em-spark-assembly.jar!/:?]
> >>> > at 
> >>> > org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1524)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1588)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1389)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.storage.BlockManager.getOrElseUpdateRDDBlock(BlockManager.scala:1343)
> ~[em-spark-assembly.jar!/:?]
> >>> > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:379)
> ~[em-spark-assembly.jar!/:?]
> >>> > at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
> ~[em-spark-assembly.jar!/:?]
> >>> > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
> ~[em-spark-assembly.jar!/:?]
> >>> > at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
> ~[em-spark-assembly.jar!/:?]
> >>> > at org.apache.spark.scheduler.Task.run(Task.scala:141)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
> ~[em-spark-assembly.jar!/:?]
> >>> > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
> ~[em-spark-assembly.jar!/:?]
> >>> > at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
> [em-spark-assembly.jar!/:?]
> >>> > at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> [?:?]
> >>> > at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> [?:?]
> >>> > at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
> >>> > --------------------------------------------------------
> >>> > I need your help in understanding when this issue can occur.
> >>> > Please let me know if any further information is needed from my
> side. (Attaching the full logs for reference.)
> >>> >
> >>> > PS: Spark is running in local mode.
> >>> >
> >>> > Regards,
> >>> > Ashwani
> >>> >
> >>> > ---------------------------------------------------------------------
> >>> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
