AshinGau opened a new pull request, #22187: URL: https://github.com/apache/doris/pull/22187
## Proposed changes

Fix errors when analyzing Hudi tables or reading only Hudi partition columns.

When a query reads only partition columns, `required_fields` is an empty string, so an error is thrown while parsing the column types:

```
W0724 16:44:38.431789 296139 jni-util.cpp:239] java.util.NoSuchElementException: key not found: 
    at scala.collection.MapLike.default(MapLike.scala:236)
    at scala.collection.MapLike.default$(MapLike.scala:235)
    at scala.collection.AbstractMap.default(Map.scala:65)
    at scala.collection.MapLike.apply(MapLike.scala:144)
    at scala.collection.MapLike.apply$(MapLike.scala:143)
    at scala.collection.AbstractMap.apply(Map.scala:65)
    at org.apache.doris.hudi.HoodieSplit.$anonfun$requiredTypes$1(BaseSplitReader.scala:89)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
    at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
    at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
    at scala.collection.TraversableLike.map(TraversableLike.scala:286)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
    at org.apache.doris.hudi.HoodieSplit.<init>(BaseSplitReader.scala:88)
    at org.apache.doris.hudi.HudiJniScanner.<init>(HudiJniScanner.java:66)
```

Because `HudiJniScanner.<init>` fails, `_jni_scanner_obj` is never initialized, so methods cannot be called on it. `JniConnector::close()` calls one anyway from the destructor chain, which crashes the BE:

```
5# JNI_ArgumentPusherVaArg::JNI_ArgumentPusherVaArg(_jmethodID*, __va_list_tag*) in /mnt/datadisk0/gaoxin/app/jdk1.8.0_131/jre/lib/amd64/server/libjvm.so
6# jni_CallObjectMethodV in /mnt/datadisk0/gaoxin/app/jdk1.8.0_131/jre/lib/amd64/server/libjvm.so
7# JNIEnv_::CallObjectMethod(_jobject*, _jmethodID*, ...) in /mnt/datadisk0/gaoxin/doris/output/be/lib/doris_be
8# doris::vectorized::JniConnector::close() in /mnt/datadisk0/gaoxin/doris/output/be/lib/doris_be
9# doris::vectorized::JniConnector::~JniConnector() in /mnt/datadisk0/gaoxin/doris/output/be/lib/doris_be
10# doris::vectorized::HudiJniReader::~HudiJniReader() in /mnt/datadisk0/gaoxin/doris/output/be/lib/doris_be
11# doris::vectorized::VFileScanner::~VFileScanner() in /mnt/datadisk0/gaoxin/doris/output/be/lib/doris_be
12# doris::vectorized::VFileScanner::~VFileScanner() in /mnt/datadisk0/gaoxin/doris/output/be/lib/doris_be
13# doris::vectorized::ScannerContext::_close_and_clear_scanners(doris::vectorized::VScanNode*, doris::RuntimeState*) in /mnt/datadisk0/gaoxin/doris/output/be/lib/doris_be
```

## How to fix

1. When only partition columns are read, `JniConnector` produces empty required fields, so `HudiJniScanner` should read at least the `_hoodie_record_key` field to know how many rows the current Hudi split contains. Even if `JniConnector` doesn't read this field, the call to `releaseTable` in `JniConnector` reclaims the resource.
2. To keep the BE from crashing and exiting, `JniConnector` may call the release method only after `HudiJniScanner` has been initialized. Note that `VectorTable` is created lazily in `JniScanner`, so there is no resource to reclaim when `HudiJniScanner` fails to initialize.
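The guard described in the second fix can be illustrated with a minimal, self-contained C++ sketch. This is not the Doris implementation: `FakeScanner`, `SketchConnector`, and the `_scanner_opened` flag are hypothetical names standing in for the Java-side scanner and for `JniConnector`. The sketch assumes only that the scanner's constructor can fail and that release must be skipped in that case.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the Java-side scanner; not Doris code.
class FakeScanner {
public:
    explicit FakeScanner(const std::vector<std::string>& required_fields) {
        // Mimics a constructor that fails when no field is requested.
        if (required_fields.empty()) {
            throw std::runtime_error("key not found: ");
        }
    }
    void release() {}
};

// Hypothetical connector; only the guard pattern matters here.
class SketchConnector {
public:
    // Returns true when the scanner was constructed successfully.
    bool open(const std::vector<std::string>& required_fields) {
        try {
            _scanner = std::make_unique<FakeScanner>(required_fields);
            _scanner_opened = true;
        } catch (const std::exception&) {
            _scanner.reset();
            _scanner_opened = false;
        }
        return _scanner_opened;
    }

    // Returns true only if release was invoked on a live scanner.
    bool close() {
        if (!_scanner_opened) {
            return false;  // nothing was created, so nothing to release
        }
        assert(_scanner != nullptr);
        _scanner->release();
        _scanner.reset();
        _scanner_opened = false;
        return true;
    }

    // Safe even after a failed open(), because close() checks the flag.
    ~SketchConnector() { close(); }

private:
    std::unique_ptr<FakeScanner> _scanner;
    bool _scanner_opened = false;
};
```

With this shape, a connector whose `open` failed can still be destroyed safely, because the destructor path goes through the same flag check instead of calling into an object that was never created.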