keerthiskating opened a new issue, #11712: URL: https://github.com/apache/hudi/issues/11712
**Describe the problem you faced**

My Hudi job runs fine for the first 9-10 executions. The run after that hangs and neither succeeds nor fails. I am running this on Glue 4.0 with Hudi 0.14. Going through the Spark UI, the job appears to hang on the `Preparing compaction metadata: gft_fact_consol_hudi_metadata` step.

<img width="1485" alt="Screenshot 2024-07-31 at 1 07 15 PM" src="https://github.com/user-attachments/assets/f885dc1b-6b18-4afa-93ac-95a25edca287">

**To Reproduce**

Steps to reproduce the behavior: below are the Hudi options used.

```
{
    'hoodie.table.cdc.enabled': 'true',
    'hoodie.table.cdc.supplemental.logging.mode': 'data_before_after',
    'hoodie.datasource.write.recordkey.field': 'bazaar_uuid',
    'hoodie.datasource.write.keygenerator.class': 'org.apache.hudi.keygen.ComplexKeyGenerator',
    'hoodie.table.name': 'gft_fact_consol_hudi',
    'hoodie.datasource.write.table.name': 'gft_fact_consol_hudi',
    'hoodie.datasource.hive_sync.table': 'gft_fact_consol_hudi',
    'hoodie.datasource.hive_sync.database': 'default',
    'hoodie.datasource.write.partitionpath.field': 'a,b,c',
    'hoodie.datasource.hive_sync.partition_fields': 'a,b,c',
    'hoodie.datasource.write.hive_style_partitioning': 'true',
    'hoodie.datasource.hive_sync.enable': 'true',
    'hoodie.datasource.hive_sync.partition_extractor_class': 'org.apache.hudi.hive.MultiPartKeysValueExtractor',
    'hoodie.metadata.enable': 'true',
    'hoodie.metadata.record.index.enable': 'true',
    'hoodie.cleaner.policy': 'KEEP_LATEST_FILE_VERSIONS',
    # 'hoodie.parquet.small.file.limit': 104857600,
    # 'hoodie.parquet.max.file.size': 125829120,
    'hoodie.clustering.inline': 'true',
    'hoodie.clustering.inline.max.commits': '4',
    'hoodie.datasource.write.storage.type': 'COPY_ON_WRITE',
    'hoodie.datasource.write.operation': 'upsert',
    'hoodie.datasource.write.precombine.field': 'record_uuid',
    'hoodie.datasource.hive_sync.use_jdbc': 'false',
    'hoodie.datasource.hive_sync.mode': 'hms',
    'hoodie.datasource.hive_sync.support_timestamp': 'true',
    # 'hoodie.write.concurrency.mode': 'OPTIMISTIC_CONCURRENCY_CONTROL',
    # 'hoodie.write.lock.provider': 'org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider',
    # 'hoodie.cleaner.policy.failed.writes': 'LAZY',
    # 'hoodie.write.lock.dynamodb.table': 'fri_hudi_locks_table',
    # 'hoodie.embed.timeline.server': 'false',
    # 'hoodie.write.lock.client.wait_time_ms_between_retry': 50000,
    # 'hoodie.write.lock.wait_time_ms_between_retry': 20000,
    # 'hoodie.write.lock.wait_time_ms': 60000,
    # 'hoodie.write.lock.client.num_retries': 15,
    # 'hoodie.keep.max.commits': '7',
    # 'hoodie.keep.min.commits': '6',
    # 'hoodie.write.lock.dynamodb.region': 'us-west-2',
    # 'hoodie.write.lock.dynamodb.endpoint_url': 'dynamodb.us-west-2.amazonaws.com'
}
```

**Expected behavior**

As per https://hudi.apache.org/docs/compaction#background, compaction should only occur for MOR tables. Any idea why it is happening for a COW table?

**Environment Description**

* Hudi version : 0.14
* Spark version : 3.3.0
* Hive version :
* Hadoop version :
* Storage (HDFS/S3/GCS..) : s3
* Running on Docker? (yes/no) :
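For context, here is a minimal sketch of how options like these are typically passed to a Hudi write in a PySpark/Glue job. The `df` DataFrame, the `write_hudi` helper, and the S3 path are assumptions for illustration, not taken from the original job; the dict below is abridged to the options most relevant to the report.

```python
# Abridged copy of the options from the report above; the full dict
# in the issue contains the hive_sync, CDC, and clustering settings too.
hudi_options = {
    'hoodie.table.name': 'gft_fact_consol_hudi',
    'hoodie.datasource.write.recordkey.field': 'bazaar_uuid',
    'hoodie.datasource.write.precombine.field': 'record_uuid',
    'hoodie.datasource.write.storage.type': 'COPY_ON_WRITE',
    'hoodie.datasource.write.operation': 'upsert',
    'hoodie.metadata.enable': 'true',
    'hoodie.metadata.record.index.enable': 'true',
    # ... remaining options from the block above ...
}

def write_hudi(df, path='s3://my-bucket/gft_fact_consol_hudi'):
    # Hypothetical path; assumes a SparkSession and DataFrame `df`
    # already exist in the Glue job.
    (df.write
       .format('hudi')
       .options(**hudi_options)
       .mode('append')
       .save(path))
```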
