[ https://issues.apache.org/jira/browse/KUDU-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933599#comment-17933599 ]
Quanlong Huang commented on KUDU-3649:
--------------------------------------

Uploaded a fix for review: https://gerrit.cloudera.org/c/22602/

> Last processed Hive Metastore notification event ID is not loaded correctly
> ---------------------------------------------------------------------------
>
>                 Key: KUDU-3649
>                 URL: https://issues.apache.org/jira/browse/KUDU-3649
>             Project: Kudu
>          Issue Type: Bug
>          Components: server
>            Reporter: Quanlong Huang
>            Priority: Critical
>         Attachments: kudu-debug.patch,
>                      kudu-master.quanlong-OptiPlex-BJ.quanlong.log.INFO.20250309-094427.22679
>
>
> While launching kudu-master with Hive Metastore integration enabled, I don't
> see the following log:
> {code:cpp}
>   if (hms_catalog_) {
>     static const char* const kNotificationLogEventIdDescription =
>         "Loading latest processed Hive Metastore notification log event ID";
>     LOG(INFO) << kNotificationLogEventIdDescription << "...";{code}
> https://github.com/apache/kudu/blob/e742f86f6d8e687dd02d9891f33e068477163016/src/kudu/master/catalog_manager.cc#L1446-L1447
> The Kudu version used in Impala's test env is commit e742f86f6. I patched
> Kudu to add more debug logs with [^kudu-debug.patch] and then saw the
> following logs:
> {noformat}
> I20250309 09:44:28.930799 22774 catalog_manager.cc:1434] Initializing in-progress tserver states...
> I20250309 09:44:28.930833 22774 catalog_manager.cc:1455] hms_catalog_ is nullptr
> I20250309 09:44:28.930859 22765 hms_catalog.cc:109] Initializing HmsCatalog
> I20250309 09:44:28.931007 22790 hms_notification_log_listener.cc:222] durable_event_id = -1, batch_size = 100
> I20250309 09:44:28.936439 22788 hms_client.cc:369] Fetching 100 HMS events from id -1{noformat}
> This means there is a race between initializing hms_catalog_
> [here|https://github.com/apache/kudu/blob/e742f86f6d8e687dd02d9891f33e068477163016/src/kudu/master/catalog_manager.cc#L1043]
> and using it
> [here|https://github.com/apache/kudu/blob/e742f86f6d8e687dd02d9891f33e068477163016/src/kudu/master/catalog_manager.cc#L1444].
> In the logs above, thread 22774 finds hms_catalog_ still null just before
> thread 22765 starts initializing HmsCatalog. As a result,
> hms_notification_log_event_id_ is not loaded and keeps its original value
> of -1, so the hms-notification-log-listener thread in kudu-master starts
> fetching all HMS notification events. In my local env, many HMS
> notification events have large message bodies, and HMS runs out of memory
> (OOM) while serving these requests. So HmsNotificationLogListenerTask::Poll()
> never succeeds and keeps polling events from id -1. Attached the master logs:
> [^kudu-master.quanlong-OptiPlex-BJ.quanlong.log.INFO.20250309-094427.22679]
> Because of this, creating managed tables in Kudu fails with the following
> error. IMPALA-13846 is an example.
> {code}
> failed to wait for Hive Metastore notification log listener to catch up:
> failed to retrieve notification log events: failed to get Hive Metastore next
> notification: No more data to read.{code}
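
The ordering problem described in the issue can be illustrated with a small sequential model. This is only a sketch under stated assumptions: FakeHmsCatalog, FakeCatalogManager, SimulateStartup and the persisted ID value are hypothetical names invented for illustration, not Kudu's real classes, and the sketch models the two steps in a fixed order rather than reproducing the actual threads or the fix under review.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical stand-ins for illustration only; not Kudu's real classes.
struct FakeHmsCatalog {};

struct FakeCatalogManager {
  std::unique_ptr<FakeHmsCatalog> hms_catalog;  // null until InitHmsCatalog()
  int64_t event_id = -1;                        // -1 means "nothing loaded"

  void InitHmsCatalog() { hms_catalog = std::make_unique<FakeHmsCatalog>(); }

  // Same shape as the guard quoted in the issue: when the pointer is still
  // null, the load is skipped silently and event_id keeps its default.
  void LoadEventId(int64_t persisted_id) {
    if (hms_catalog) {
      event_id = persisted_id;
    }
  }
};

// Runs the two steps in the given order and returns the ID that a listener
// would start polling from.
int64_t SimulateStartup(bool init_catalog_first, int64_t persisted_id) {
  FakeCatalogManager m;
  if (init_catalog_first) {
    m.InitHmsCatalog();
    m.LoadEventId(persisted_id);
  } else {
    m.LoadEventId(persisted_id);  // guard sees nullptr, load is skipped
    m.InitHmsCatalog();
  }
  return m.event_id;
}
```

With a persisted ID of 42, SimulateStartup(true, 42) yields 42, while SimulateStartup(false, 42) yields -1, which corresponds to the "Fetching 100 HMS events from id -1" line in the logs.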