jiangbiao910 opened a new issue, #6462:
URL: https://github.com/apache/hudi/issues/6462

   Environment Description
   Hudi version : 0.11.0
   Flink version : 1.13.1
   Hive version : 2.1.1-cdh6.2.0
   Hadoop version : 3.0.0-cdh6.2.0
   Storage (HDFS/S3/GCS..) : HDFS
   Running on Docker? (yes/no) : no
   
   when I use execute Spark SQL:
   SET TBLPROPERTIES ("hoodie.metadata.enable"="false") 
   `INSERT OVERWRITE TABLE zone_dw.dws_stat_day_di PARTITION(dt) 
   select * from zone_dw.dws_map_refresh_hi where dt BETWEEN '20220812' AND 
'20220819' ` 
   zone_dw.dws_stat_day_di is Hive Table  and  zone_dw.dws_map_refresh_hi is  
Hudi COW table
   
   the error log :
   
   22/08/20 02:56:58 ERROR client.RemoteDriver: Failed to run client job 
39d720db-b15d-4823-b8b1-54398b143d6e
   org.apache.hudi.exception.HoodieException: Error fetching partition paths 
from metadata table
        at 
org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:315)
        at 
org.apache.hudi.BaseHoodieTableFileIndex.getAllQueryPartitionPaths(BaseHoodieTableFileIndex.java:176)
        at 
org.apache.hudi.BaseHoodieTableFileIndex.loadPartitionPathFiles(BaseHoodieTableFileIndex.java:219)
        at 
org.apache.hudi.BaseHoodieTableFileIndex.doRefresh(BaseHoodieTableFileIndex.java:264)
        at 
org.apache.hudi.BaseHoodieTableFileIndex.<init>(BaseHoodieTableFileIndex.java:139)
        at 
org.apache.hudi.hadoop.HiveHoodieTableFileIndex.<init>(HiveHoodieTableFileIndex.java:49)
        at 
org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.listStatusForSnapshotMode(HoodieCopyOnWriteTableInputFormat.java:234)
        at 
org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.listStatus(HoodieCopyOnWriteTableInputFormat.java:141)
        at 
org.apache.hudi.hadoop.HoodieParquetInputFormatBase.listStatus(HoodieParquetInputFormatBase.java:90)
        at 
org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat$HoodieCombineFileInputFormatShim.listStatus(HoodieCombineHiveInputFormat.java:889)
        at 
org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
        at 
org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:76)
        at 
org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat$HoodieCombineFileInputFormatShim.getSplits(HoodieCombineHiveInputFormat.java:942)
        at 
org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat.getCombineSplits(HoodieCombineHiveInputFormat.java:241)
        at 
org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat.getSplits(HoodieCombineHiveInputFormat.java:363)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:205)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
        at org.apache.spark.rdd.RDD.getNumPartitions(RDD.scala:267)
        at 
org.apache.spark.api.java.JavaRDDLike$class.getNumPartitions(JavaRDDLike.scala:65)
        at 
org.apache.spark.api.java.AbstractJavaRDDLike.getNumPartitions(JavaRDDLike.scala:45)
        at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:252)
        at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:179)
        at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:130)
        at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:355)
        at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:400)
        at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:365)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to 
retrieve list of partition from metadata
        at 
org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:113)
        at 
org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:313)
        ... 32 more
   Caused by: java.util.NoSuchElementException: No value present in Option
        at org.apache.hudi.common.util.Option.get(Option.java:89)
        at 
org.apache.hudi.metadata.HoodieTableMetadataUtil.getPartitionFileSlices(HoodieTableMetadataUtil.java:1057)
        at 
org.apache.hudi.metadata.HoodieTableMetadataUtil.getPartitionLatestMergedFileSlices(HoodieTableMetadataUtil.java:1001)
        at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getPartitionFileSliceToKeysMapping(HoodieBackedTableMetadata.java:377)
        at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:204)
        at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:140)
        at 
org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:281)
        at 
org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:111)
        ... 33 more
   22/08/20 02:56:59 INFO client.RemoteDriver: Shutting down Spark Remote 
Driver.
   22/08/20 02:56:59 INFO server.AbstractConnector: Stopped 
Spark@ce7a81b{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
   22/08/20 02:56:59 INFO ui.SparkUI: Stopped Spark web UI at 
http://scsp04097:34219
   22/08/20 02:56:59 INFO yarn.YarnAllocator: Driver requested a total number 
of 0 executor(s).
   22/08/20 02:56:59 INFO cluster.YarnClusterSchedulerBackend: Shutting down 
all executors
   22/08/20 02:56:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: 
Asking each executor to shut down
   22/08/20 02:56:59 INFO cluster.SchedulerExtensionServices: Stopping 
SchedulerExtensionServices
   (serviceOption=None,
    services=List(),
    started=false)
   22/08/20 02:56:59 INFO spark.MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   22/08/20 02:56:59 INFO memory.MemoryStore: MemoryStore cleared
   
   
   **But when I run the sql again, it work well;** 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to