Wenning Ding created HUDI-4382:
----------------------------------
Summary: Add logger for HoodieCopyOnWriteTableInputFormat
Key: HUDI-4382
URL: https://issues.apache.org/jira/browse/HUDI-4382
Project: Apache Hudi
Issue Type: Bug
Reporter: Wenning Ding
When querying the ro and rt bootstrap MOR tables using Presto, I observed that
both queries fail with the following exception:
{{java.lang.NoSuchFieldError: LOG
at org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.makeExternalFileSplit(HoodieCopyOnWriteTableInputFormat.java:199)
at org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.makeSplit(HoodieCopyOnWriteTableInputFormat.java:100)
at org.apache.hudi.hadoop.realtime.HoodieMergeOnReadTableInputFormat.doMakeSplitForRealtimePath(HoodieMergeOnReadTableInputFormat.java:266)
at org.apache.hudi.hadoop.realtime.HoodieMergeOnReadTableInputFormat.makeSplit(HoodieMergeOnReadTableInputFormat.java:211)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:345)
at org.apache.hudi.hadoop.realtime.HoodieMergeOnReadTableInputFormat.getSplits(HoodieMergeOnReadTableInputFormat.java:79)
at org.apache.hudi.hadoop.HoodieParquetInputFormatBase.getSplits(HoodieParquetInputFormatBase.java:68)
at com.facebook.presto.hive.StoragePartitionLoader.loadPartition(StoragePartitionLoader.java:278)
at com.facebook.presto.hive.DelegatingPartitionLoader.loadPartition(DelegatingPartitionLoader.java:81)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:224)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$700(BackgroundHiveSplitLoader.java:50)
at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:153)
at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:47)
at com.facebook.presto.hive.util.ResumableTasks.access$000(ResumableTasks.java:20)
at com.facebook.presto.hive.util.ResumableTasks$1.run(ResumableTasks.java:35)
at com.facebook.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)}}
The {{java.lang.NoSuchFieldError: LOG}} occurs during the Presto query because
{{HoodieCopyOnWriteTableInputFormat}} inherits the field {{LOG}} from its
parent class {{FileInputFormat}}, which is a Hadoop class.
At compile time, the reference to this field is resolved against
{{FileInputFormat.class}}. At runtime, however, Presto does not have all of the
Hadoop classes on its classpath; it uses its own Hadoop dependency, e.g.
{{hadoop-apache2:jar}}. I checked that {{hadoop-apache2}} does not have the
class {{FileInputFormat}} shaded, which causes this runtime error.
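As the summary suggests, one way to avoid depending on the inherited field is for the class to declare its own logger. The following is only a minimal sketch of that idea, not the actual patch: {{ParentInputFormat}} and {{ChildInputFormat}} are hypothetical stand-ins for {{FileInputFormat}} and {{HoodieCopyOnWriteTableInputFormat}}, and {{java.util.logging}} is used here only so the example has no external dependencies.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for a third-party base class such as Hadoop's FileInputFormat.
class ParentInputFormat {
    // Field owned by the library; the jar present at runtime may not provide it.
    public static final Logger LOG = Logger.getLogger(ParentInputFormat.class.getName());
}

// Hypothetical stand-in for HoodieCopyOnWriteTableInputFormat.
class ChildInputFormat extends ParentInputFormat {
    // Declaring LOG here hides ParentInputFormat.LOG, so every use of LOG in this
    // class refers to a field that the class itself declares.
    private static final Logger LOG = Logger.getLogger(ChildInputFormat.class.getName());

    static String makeSplit(String path) {
        LOG.fine("Making split for " + path);
        return "split:" + path;
    }

    // Exposes which logger the class actually uses, for illustration only.
    static String loggerName() {
        return LOG.getName();
    }
}
```

With the field declared locally, the class no longer relies on whatever {{LOG}} field the parent class happens to expose in the runtime classpath.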