Klring opened a new issue, #8737:
URL: https://github.com/apache/hudi/issues/8737
**Describe the problem you faced**
We use Flink to write log data. The setup is multi-writer + COW + clustering, and after running for a while an error occurred during clustering. The error indicates that a specific Parquet file does not exist. We have tried removing the clustering config from the other write job, but the problem still persists. How can we resolve it? Does Hudi support multi-writer and clustering at the same time?
**Config**
```
'table.type' = 'COPY_ON_WRITE',
'hoodie.datasource.write.partitionpath.field' = 'receive_date',
'hoodie.datasource.write.hive_style_partitioning' = 'true',
'write.operation' = 'insert',
'write.bucket_assign.tasks' = '2',
'write.tasks' = '4',
'write.task.max.size' = '2048',
'write.merge.max_memory' = '1024',
'write.batch.size' = '256',
'metadata.enabled' = 'true',
'hoodie.enable.data.skipping' = 'true',
'clustering.schedule.enabled' = 'true',
'clustering.async.enabled' = 'true',
'hoodie.clustering.plan.strategy.small.file.limit' = '629145600',
'hoodie.clustering.plan.strategy.target.file.max.bytes' = '1073741824',
'hoodie.write.concurrency.mode' = 'optimistic_concurrency_control',
'hoodie.cleaner.policy.failed.writes' = 'LAZY',
'hoodie.write.lock.wait_time_ms' = '300000',
'hoodie.write.lock.num_retries' = '60',
'hoodie.write.lock.provider' = 'org.apache.hudi.hive.transaction.lock.HiveMetastoreBasedLockProvider',
'hoodie.write.lock.hivemetastore.uris' = 'thrift://xxxx:9083',
'hoodie.write.lock.hivemetastore.database' = 'log',
'hoodie.write.lock.hivemetastore.table' = 'log_test',
'clean.async.enabled' = 'true',
'clean.retain_file_versions' = '1',
'cleaner.policy' = 'KEEP_LATEST_FILE_VERSIONS',
'clean.retain_commits' = '1',
'hoodie.cleaner.delete.bootstrap.base.file' = 'true',
'hoodie.archive.automatic' = 'true',
'hoodie.archive.merge.enable' = 'true',
'hoodie.archive.async' = 'true',
'hive_sync.enable' = 'true',
'hive_sync.mode' = 'hms',
'hive_sync.metastore.uris' = 'thrift://xxxx:9083',
'hive_sync.db' = 'log',
'hive_sync.table' = 'log_test',
'hive_sync.partition_fields' = 'receive_date',
'hive_sync.support_timestamp' = 'true'
```
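For context, these options are applied through a Flink SQL DDL roughly like the sketch below. The column list is a simplified placeholder (our real schema is larger); the path matches the table path seen in the stacktrace:

```sql
-- Simplified sketch of the sink DDL; columns are placeholders,
-- the WITH options are the ones listed above.
CREATE TABLE log_test (
  log_id STRING,
  content STRING,
  receive_date STRING
) PARTITIONED BY (receive_date) WITH (
  'connector' = 'hudi',
  'path' = 'obs://platform-daka/warehouse/hive/log.db/log_test',
  'table.type' = 'COPY_ON_WRITE',
  'write.operation' = 'insert'
  -- ... remaining options as listed above
);
```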
**Environment Description**
* Hudi version: 0.13.0
* Flink version: 1.14.4
* Hive version: 3.1.2
* Hadoop version: 3.2.3
* Storage (HDFS/S3/GCS..): OBS
* Running on Docker? (yes/no): no
**Stacktrace**
```
2023-05-17 11:11:05,232 ERROR org.apache.hudi.sink.clustering.ClusteringOperator [] - Executor executes action [Execute clustering for instant 20230517110938069 from task 1] error
org.apache.hudi.exception.HoodieClusteringException: Error reading input data for obs://platform-daka/warehouse/hive/log.db/log_test/receive_date=2023-05-16/bc26fb97-68ba-4634-8d36-a47e3c244f89-0_3-4-0_20230517110750167.parquet and []
    at org.apache.hudi.sink.clustering.ClusteringOperator.lambda$null$4(ClusteringOperator.java:337) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at java.lang.Iterable.spliterator(Iterable.java:101) ~[?:1.8.0_333]
    at org.apache.hudi.sink.clustering.ClusteringOperator.lambda$readRecordsForGroupBaseFiles$5(ClusteringOperator.java:341) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_333]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384) ~[?:1.8.0_333]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_333]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_333]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_333]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_333]
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[?:1.8.0_333]
    at org.apache.hudi.sink.clustering.ClusteringOperator.readRecordsForGroupBaseFiles(ClusteringOperator.java:342) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.sink.clustering.ClusteringOperator.doClustering(ClusteringOperator.java:242) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.sink.clustering.ClusteringOperator.lambda$processElement$0(ClusteringOperator.java:194) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:130) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_333]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_333]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_333]
Caused by: java.io.FileNotFoundException: getFileStatus on obs://platform-daka/warehouse/hive/log.db/log_test/receive_date=2023-05-16/bc26fb97-68ba-4634-8d36-a47e3c244f89-0_3-4-0_20230517110750167.parquet: status [404] - request id [0000018827B09A07930FB5D6E65CDE27] - error code [null] - error message [Request Error.] - trace :com.obs.services.exception.ObsException: Error message:Request Error.OBS servcie Error Message. -- ResponseCode: 404, ResponseStatus: Not Found, RequestId: 0000018827B09A07930FB5D6E65CDE27, HostId: 32AAAQAAEAABAAAQAAEAABAAAQAAEAABCSe68Sh+FOp7ECYlweelx1ks1zjfo/Qs
    at org.apache.hadoop.fs.obs.OBSCommonUtils.translateException(OBSCommonUtils.java:409) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSCommonUtils.translateException(OBSCommonUtils.java:691) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSPosixBucketUtils.innerFsGetObjectStatus(OBSPosixBucketUtils.java:563) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSFileSystem.innerGetFileStatus(OBSFileSystem.java:1663) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSCommonUtils.innerGetFileStatusWithRetry(OBSCommonUtils.java:1662) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSFileSystem.getFileStatus(OBSFileSystem.java:1631) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.parquet.hadoop.ParquetReader$Builder.build(ParquetReader.java:337) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.io.storage.HoodieAvroParquetReader.getIndexedRecordIteratorInternal(HoodieAvroParquetReader.java:168) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.io.storage.HoodieAvroParquetReader.getIndexedRecordIterator(HoodieAvroParquetReader.java:94) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.io.storage.HoodieAvroParquetReader.getRecordIterator(HoodieAvroParquetReader.java:73) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.sink.clustering.ClusteringOperator.lambda$null$4(ClusteringOperator.java:334) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    ... 16 more
Caused by: com.obs.services.exception.ObsException: Error message:Request Error.OBS servcie Error Message.
    at com.obs.services.internal.utils.ServiceUtils.changeFromServiceException(ServiceUtils.java:620) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at com.obs.services.AbstractClient.doActionWithResult(AbstractClient.java:399) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at com.obs.services.AbstractObjectClient.getObjectMetadata(AbstractObjectClient.java:583) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at com.obs.services.AbstractPFSClient.getAttribute(AbstractPFSClient.java:154) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSPosixBucketUtils.innerFsGetObjectStatus(OBSPosixBucketUtils.java:551) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSFileSystem.innerGetFileStatus(OBSFileSystem.java:1663) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSCommonUtils.innerGetFileStatusWithRetry(OBSCommonUtils.java:1662) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.hadoop.fs.obs.OBSFileSystem.getFileStatus(OBSFileSystem.java:1631) ~[hadoop-huaweicloud-3.2.3-hw-46.jar:?]
    at org.apache.parquet.hadoop.ParquetReader$Builder.build(ParquetReader.java:337) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.io.storage.HoodieAvroParquetReader.getIndexedRecordIteratorInternal(HoodieAvroParquetReader.java:168) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.io.storage.HoodieAvroParquetReader.getIndexedRecordIterator(HoodieAvroParquetReader.java:94) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.io.storage.HoodieAvroParquetReader.getRecordIterator(HoodieAvroParquetReader.java:73) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    at org.apache.hudi.sink.clustering.ClusteringOperator.lambda$null$4(ClusteringOperator.java:334) ~[flink_sls_applog-1.5.2-jar-with-dependencies.jar:?]
    ... 16 more
```
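For what it's worth, the Parquet file in the error can no longer be found under the partition path, so it looks like it was removed (possibly by the cleaner, given `KEEP_LATEST_FILE_VERSIONS` with `retain_file_versions = 1`) while a clustering plan still referenced it. A hypothetical way to cross-check this from Spark SQL, assuming the Hudi 0.13 Spark bundle with SQL procedures is available (we have not run this as part of the Flink job):

```sql
-- Hypothetical inspection via Hudi's Spark SQL procedures:
-- list clustering plans and recent commits on the timeline.
CALL show_clustering(table => 'log_test');
CALL show_commits(table => 'log_test', limit => 20);
```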