dataproblems opened a new issue, #11997:
URL: https://github.com/apache/hudi/issues/11997

   **Describe the problem you faced**
   
   Writes to a Hudi table on S3 fail with a `FileNotFoundException` on a file in the `archived` folder of the metadata table (under `.hoodie/metadata/.hoodie/archived`). I've verified that the file does exist in S3 and am puzzled as to what is causing this. The structured streaming query fails after the 17th batch.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   1. Create the necessary base table using a bulk insert
   2. Start consuming from kinesis and upsert to the base table
   3. After about 17 batches, the job fails with the exception below
   
   **Expected behavior**
   
   I expect the Hudi table to be updated with the new data without errors. As a sanity check, I removed the Hudi write and wrote the batch output directly to S3; in that case, the Spark job runs well beyond batch 17.
   
   **Environment Description**
   
   * Hudi version : 1.0.0-beta1
   
   * Spark version : 3.3.2 
   
   * Hive version :  3.1.3
   
   * Hadoop version : 3.3.3
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : No
   
   
   **Additional context**
   
   We are not using any async table services, and there is a single writer updating the Hudi table.
   
   Hudi Upsert Configuration: 
   
   ```scala
   val UpsertOptions: Map[String, String] = Map(
     DataSourceWriteOptions.OPERATION.key() -> DataSourceWriteOptions.UPSERT_OPERATION_OPT_VAL,
     DataSourceWriteOptions.TABLE_TYPE.key() -> DataSourceWriteOptions.COW_TABLE_TYPE_OPT_VAL,
     HoodieStorageConfig.PARQUET_COMPRESSION_CODEC_NAME.key() -> "snappy",
     HoodieStorageConfig.PARQUET_MAX_FILE_SIZE.key() -> "2147483648",
     "hoodie.parquet.small.file.limit" -> "1073741824",
     "hoodie.upsert.shuffle.parallelism" -> "5",
     HoodieMetadataConfig.ENABLE_METADATA_INDEX_COLUMN_STATS.key() -> "true",
     HoodieIndexConfig.INDEX_TYPE.key() -> "RECORD_INDEX",
     "hoodie.metadata.enable" -> "true",
     "hoodie.datasource.write.hive_style_partitioning" -> "true",
     "hoodie.cleaner.policy" -> "KEEP_LATEST_COMMITS",
     "hoodie.cleaner.commits.retained" -> "10",
     "hoodie.metadata.record.index.enable" -> "true"
   )
   ```
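   One isolation step I can try next (a sketch, not something I have run yet): override the options above to turn off the metadata table, which is the component the stacktrace points at. The keys are standard Hudi configs; `DiagnosticOverrides` is just my own name for the map.

   ```scala
   // Hedged diagnostic, not a fix: disable the metadata table (and the record
   // index, which lives inside it) to check whether the failure is isolated to
   // the metadata table's archived timeline.
   val DiagnosticOverrides: Map[String, String] = Map(
     "hoodie.metadata.enable" -> "false",
     "hoodie.metadata.record.index.enable" -> "false"
     // Note: with the metadata table off, INDEX_TYPE would also need to move
     // off RECORD_INDEX (e.g. to a BLOOM or SIMPLE index).
   )

   // Applied on top of the existing options, e.g.: UpsertOptions ++ DiagnosticOverrides
   ```

   If the stream then runs past batch 17, that would narrow the failure down to the metadata table's archiver rather than the data table writes themselves.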
   
   
   **Stacktrace**
   
   ```
   Caused by: org.apache.hudi.exception.HoodieException: Failed to instantiate Metadata table
           at org.apache.hudi.client.SparkRDDWriteClient.initializeMetadataTable(SparkRDDWriteClient.java:293)
           at org.apache.hudi.client.SparkRDDWriteClient.initMetadataTable(SparkRDDWriteClient.java:273)
           at org.apache.hudi.client.BaseHoodieWriteClient.doInitTable(BaseHoodieWriteClient.java:1250)
           at org.apache.hudi.client.BaseHoodieWriteClient.initTable(BaseHoodieWriteClient.java:1290)
           at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:139)
           at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:224)
           at org.apache.hudi.HoodieSparkSqlWriterInternal.writeInternal(HoodieSparkSqlWriter.scala:506)
           at org.apache.hudi.HoodieSparkSqlWriterInternal.write(HoodieSparkSqlWriter.scala:196)
           at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:121)
           at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:144)
           at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47)
           at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
           at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
           at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
           at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:104)
           at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
           at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:224)
           at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:114)
           at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$7(SQLExecution.scala:139)
           at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
           at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:224)
           at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:139)
           at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:245)
           at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:138)
           at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
           at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
           at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:101)
           at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97)
           at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:626)
           at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:179)
           at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:626)
           at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:31)
           at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
           at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
           at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
           at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
           at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:602)
           at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:97)
           at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:84)
           at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:82)
           at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:125)
           at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:860)
           at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:390)
   Caused by: org.apache.hudi.exception.HoodieCommitException: Failed to write commits
           at org.apache.hudi.client.timeline.LSMTimelineWriter.write(LSMTimelineWriter.java:120)
           at org.apache.hudi.client.timeline.HoodieTimelineArchiver.archiveIfRequired(HoodieTimelineArchiver.java:112)
           at org.apache.hudi.client.BaseHoodieTableServiceClient.archive(BaseHoodieTableServiceClient.java:788)
           at org.apache.hudi.client.BaseHoodieWriteClient.archive(BaseHoodieWriteClient.java:885)
           at org.apache.hudi.client.BaseHoodieWriteClient.archive(BaseHoodieWriteClient.java:895)
           at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.performTableServices(HoodieBackedTableMetadataWriter.java:1325)
           at org.apache.hudi.client.SparkRDDWriteClient.initializeMetadataTable(SparkRDDWriteClient.java:290)
           ... 77 more
   Caused by: java.io.FileNotFoundException: No such file or directory 's3://some-bucket/some-prefix/table-name/.hoodie/metadata/.hoodie/archived/00000000000000010_00000000000000012_0.parquet'
           at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:529)
           at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.getFileStatus(EmrFileSystem.java:617)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
           at org.apache.hudi.client.timeline.LSMTimelineWriter.getFileEntry(LSMTimelineWriter.java:309)
           at org.apache.hudi.client.timeline.LSMTimelineWriter.updateManifest(LSMTimelineWriter.java:158)
           at org.apache.hudi.client.timeline.LSMTimelineWriter.updateManifest(LSMTimelineWriter.java:137)
           at org.apache.hudi.client.timeline.LSMTimelineWriter.write(LSMTimelineWriter.java:118)
           ... 83 more
   24/09/23 22:33:51 INFO SparkContext: Invoking stop() from shutdown hook
   24/09/23 22:33:51 INFO SparkUI: Stopped Spark web UI at http://ip-10-0-171-12.ec2.internal:4040
   24/09/23 22:33:51 INFO YarnClientSchedulerBackend: Interrupting monitor thread
   24/09/23 22:33:51 INFO YarnClientSchedulerBackend: Shutting down all executors
   24/09/23 22:33:51 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
   24/09/23 22:33:51 INFO YarnClientSchedulerBackend: YARN client scheduler backend Stopped
   24/09/23 22:33:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   24/09/23 22:33:51 INFO MemoryStore: MemoryStore cleared
   24/09/23 22:33:51 INFO BlockManager: BlockManager stopped
   24/09/23 22:33:51 INFO BlockManagerMaster: BlockManagerMaster stopped
   24/09/23 22:33:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   24/09/23 22:33:51 INFO SparkContext: Successfully stopped SparkContext
   24/09/23 22:33:51 INFO ShutdownHookManager: Shutdown hook called
   24/09/23 22:33:51 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-d3efe56e-91fa-4bf9-930b-78ac8c76ff79
   24/09/23 22:33:51 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-adf36161-48cc-4f63-922d-d6f16b5d9be4
   Command exiting with ret '1'
   ```
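   For context on the missing file's name: in Hudi 1.x the archived timeline is stored as an LSM tree, and the parquet file name in the stacktrace appears to encode `<minInstant>_<maxInstant>_<layer>.parquet`. A small hypothetical decoder (`LsmFileName` and `Entry` are my own illustrative names, not Hudi APIs):

   ```scala
   // Hypothetical decoder for LSM archived-timeline file names such as
   // "00000000000000010_00000000000000012_0.parquet" (from the stacktrace).
   object LsmFileName {
     case class Entry(minInstant: String, maxInstant: String, layer: Int)

     private val Pattern = """(\d+)_(\d+)_(\d+)\.parquet""".r

     def parse(fileName: String): Option[Entry] = fileName match {
       case Pattern(min, max, layer) => Some(Entry(min, max, layer.toInt))
       case _                        => None
     }
   }
   ```

   If this reading is right, the missing file covers archived instants 10 through 12 at layer 0, i.e. roughly the point where the metadata table's timeline first rolls over into the archive.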
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
