gbcoder2020 opened a new issue, #13522: URL: https://github.com/apache/hudi/issues/13522
**Describe the problem you faced**

I'm creating a table using INSERT mode with the record level index enabled. The Spark job is failing with executor errors (full stacktrace below).

**To Reproduce**

Steps to reproduce the behavior:

```scala
data.write
  .format("hudi")
  .options(..)
  .mode("Overwrite")
  .save(<path>)
```

Hudi options for insert:

```
hoodie.metadata.record.index.max.filegroup.count -> 100000,
hoodie.embed.timeline.server -> false,
hoodie.parquet.small.file.limit -> 1073741824,
hoodie.insert.shuffle.parallelism -> 15800,
hoodie.metadata.record.index.enable -> true,
path -> <hudi table path>,
hoodie.datasource.write.precombine.field -> lut,
hoodie.datasource.write.payload.class -> org.apache.hudi.common.model.OverwriteWithLatestAvroPayload,
hoodie.metadata.index.column.stats.enable -> true,
hoodie.parquet.max.file.size -> 2147483648,
hoodie.metadata.enable -> true,
hoodie.index.type -> RECORD_INDEX,
hoodie.datasource.write.operation -> insert,
hoodie.parquet.compression.codec -> snappy,
hoodie.datasource.write.recordkey.field -> <id>,
hoodie.table.name -> <table_name>,
hoodie.datasource.write.table.type -> COPY_ON_WRITE,
hoodie.datasource.write.hive_style_partitioning -> true,
hoodie.write.markers.type -> DIRECT,
hoodie.populate.meta.fields -> true,
hoodie.datasource.write.keygenerator.class -> org.apache.hudi.keygen.SimpleKeyGenerator,
hoodie.write.lock.provider -> org.apache.hudi.client.transaction.lock.InProcessLockProvider,
hoodie.datasource.write.partitionpath.field -> entityType,
hoodie.metadata.record.index.min.filegroup.count -> 5000,
hoodie.write.concurrency.mode -> SINGLE_WRITER
```

**Expected behavior**

Successful insert into the Hudi table.

**Environment Description**

* Hudi version : 0.15.0
* Spark version : 3.4
* Hive version : NA
* Hadoop version :
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker?
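For reference, the flattened option list above can be assembled as a Scala `Map` and passed to `.options(...)`. This is a minimal sketch, not the reporter's exact code: it reproduces a representative subset of the reported settings, the `<path>`-style placeholders still need real values, and `data` is assumed to be an existing `DataFrame`.

```scala
// A subset of the reported write options, expressed as a Scala Map.
// Values are copied verbatim from the issue report above.
val hudiOptions: Map[String, String] = Map(
  "hoodie.metadata.enable"                           -> "true",
  "hoodie.metadata.record.index.enable"              -> "true",
  "hoodie.index.type"                                -> "RECORD_INDEX",
  "hoodie.metadata.record.index.min.filegroup.count" -> "5000",
  "hoodie.metadata.record.index.max.filegroup.count" -> "100000",
  "hoodie.datasource.write.operation"                -> "insert",
  "hoodie.datasource.write.table.type"               -> "COPY_ON_WRITE",
  "hoodie.insert.shuffle.parallelism"                -> "15800",
  "hoodie.parquet.max.file.size"                     -> "2147483648"
)

// With a live SparkSession this would be wired up as (tablePath is a placeholder):
// data.write.format("hudi").options(hudiOptions).mode("Overwrite").save(tablePath)
```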
(yes/no) : no

**Additional context**

Please help understand why this problem may be occurring.

**Stacktrace**

```
Building workload profile: <table>
countByKey at HoodieJavaPairRDD.java:105+details
RDD: MapPartitionsRDD
org.apache.spark.api.java.JavaPairRDD.countByKey(JavaPairRDD.scala:314)
org.apache.hudi.data.HoodieJavaPairRDD.countByKey(HoodieJavaPairRDD.java:105)
org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.buildProfile(BaseSparkCommitActionExecutor.java:197)
org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:168)
org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:85)
org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:58)
org.apache.hudi.table.action.commit.SparkInsertCommitActionExecutor.execute(SparkInsertCommitActionExecutor.java:44)
org.apache.hudi.table.HoodieSparkCopyOnWriteTable.insert(HoodieSparkCopyOnWriteTable.java:114)
org.apache.hudi.table.HoodieSparkCopyOnWriteTable.insert(HoodieSparkCopyOnWriteTable.java:98)
org.apache.hudi.client.SparkRDDWriteClient.insert(SparkRDDWriteClient.java:182)
org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:219)
org.apache.hudi.HoodieSparkSqlWriterInternal.liftedTree1$1(HoodieSparkSqlWriter.scala:492)
org.apache.hudi.HoodieSparkSqlWriterInternal.writeInternal(HoodieSparkSqlWriter.scala:490)
org.apache.hudi.HoodieSparkSqlWriterInternal.write(HoodieSparkSqlWriter.scala:187)
org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:125)
org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:168)
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47)
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
```

```
Job aborted due to stage failure: Task 427 in stage 39.0 failed 4 times, most recent failure:
Lost task 427.3 in stage 39.0 (TID 162985) (<ip> executor 853): ExecutorLostFailure
(executor 853 exited caused by one of the running tasks)
Reason: Container from a bad node: container_1749069924658_0001_01_001944 on host: <ip>.
Exit status: 143. Diagnostics: [2025-06-05 00:16:15.459] Container killed on request. Exit code is 143
```

```
3134.968: [GC concurrent-string-deduplication, 944.0B->808.0B(136.0B), avg 63.1%, 0.0000343 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill %p"
#   Executing /bin/sh -c "kill 15018"...
Heap
 garbage-first heap   total 55574528K, used 47555500K [0x00007f2c78000000, 0x00007f2c7880d400, 0x00007f39b8000000)
  region size 8192K, 59 young (483328K), 0 survivors (0K)
 Metaspace       used 109421K, capacity 115699K, committed 135680K, reserved 137216K
```
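A quick back-of-the-envelope check on the `Heap` section of the GC log above (an illustration, not a diagnosis): the G1 heap totals exactly 53 GiB and was about 86% full when the `OutOfMemoryError` fired, so the executor genuinely exhausted its heap rather than being killed for some unrelated reason.

```scala
// Figures taken verbatim from the "Heap" section of the GC log above (values in KiB).
val totalK = 55574528L // total G1 heap
val usedK  = 47555500L // used at the moment of the OOM

val totalGiB     = totalK / 1024.0 / 1024.0 // 53.0 GiB exactly
val usedFraction = usedK.toDouble / totalK  // roughly 0.856, i.e. ~86% full
```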
