[ https://issues.apache.org/jira/browse/HUDI-3642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sagar Sumit updated HUDI-3642:
------------------------------
    Status: Patch Available  (was: In Progress)

> NullPointerException during multi-writer conflict resolution
> ------------------------------------------------------------
>
>                 Key: HUDI-3642
>                 URL: https://issues.apache.org/jira/browse/HUDI-3642
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Ethan Guo
>            Assignee: Sagar Sumit
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.11.0
>
>
> Scenario: a multi-writer test with one writer ingesting via Deltastreamer in continuous mode (COW table, inserts, async clustering and cleaning, partitions under 2022/1 and 2022/2), and a second writer using the Spark datasource to backfill a different partition (2021/12).
> Upgrade path: 0.10.0 without metadata table, with the clustering instant left inflight (the job was failed mid-clustering before the upgrade) ➝ 0.11 with metadata table, keeping the same multi-writer configuration.
> On 0.10.0 we hit the NPE below from the backfill job. Need to verify whether this can still happen on latest master.
> {code:java}
> java.lang.NullPointerException
> 	at org.apache.hudi.client.transaction.ConcurrentOperation.init(ConcurrentOperation.java:121)
> 	at org.apache.hudi.client.transaction.ConcurrentOperation.<init>(ConcurrentOperation.java:61)
> 	at org.apache.hudi.client.utils.TransactionUtils.lambda$resolveWriteConflictIfAny$0(TransactionUtils.java:69)
> 	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
> 	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
> 	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
> 	at org.apache.hudi.client.utils.TransactionUtils.resolveWriteConflictIfAny(TransactionUtils.java:67)
> 	at org.apache.hudi.client.SparkRDDWriteClient.preCommit(SparkRDDWriteClient.java:501)
> 	at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:195)
> 	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:124)
> 	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:633)
> 	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:284)
> 	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
> 	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
> 	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
> 	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> 	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
> 	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
> 	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:132)
> 	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:131)
> 	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
> 	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
> 	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
> 	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
> 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
> 	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
> 	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
> 	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
> 	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
> 	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
> 	at $anonfun$res0$1(backfill_before.scala:57)
> 	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
> 	... 60 elided
> {code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
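The stack shows the NPE arising while `TransactionUtils.resolveWriteConflictIfAny` builds a `ConcurrentOperation` per timeline instant, and the scenario suggests a plausible trigger: an instant (such as the clustering left inflight before the upgrade) whose commit metadata was never written, so the loaded metadata is null. Purely to illustrate the defensive pattern a fix might take (all class and method names below are hypothetical stand-ins, not Hudi APIs), a minimal sketch:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a timeline instant; not a Hudi class.
class SketchInstant {
    final String timestamp;
    final String commitMetadata; // null models an inflight instant with no metadata written yet

    SketchInstant(String timestamp, String commitMetadata) {
        this.timestamp = timestamp;
        this.commitMetadata = commitMetadata;
    }
}

public class ConflictResolutionSketch {

    // Defensive guard: drop instants with missing metadata before building
    // per-instant conflict-resolution operations, instead of dereferencing
    // the metadata later and hitting a NullPointerException.
    static List<String> candidateInstants(List<SketchInstant> timeline) {
        return timeline.stream()
                .filter(i -> i.commitMetadata != null)
                .map(i -> i.timestamp)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<SketchInstant> timeline = Arrays.asList(
                new SketchInstant("20220101120000", "{\"operationType\":\"INSERT\"}"),
                new SketchInstant("20220102120000", null)); // e.g. clustering left inflight pre-upgrade
        System.out.println(candidateInstants(timeline)); // prints [20220101120000]
    }
}
```

Whether the real fix filters such instants or materializes their metadata differently is up to the patch; the sketch only shows where a null guard would avert this particular stack.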