[ 
https://issues.apache.org/jira/browse/KUDU-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933599#comment-17933599
 ] 

Quanlong Huang commented on KUDU-3649:
--------------------------------------

Uploaded a fix for review: https://gerrit.cloudera.org/c/22602/

> Last processed Hive Metastore notification event ID is not loaded correctly
> ---------------------------------------------------------------------------
>
>                 Key: KUDU-3649
>                 URL: https://issues.apache.org/jira/browse/KUDU-3649
>             Project: Kudu
>          Issue Type: Bug
>          Components: server
>            Reporter: Quanlong Huang
>            Priority: Critical
>         Attachments: kudu-debug.patch, 
> kudu-master.quanlong-OptiPlex-BJ.quanlong.log.INFO.20250309-094427.22679
>
>
> While launching kudu-master with Hive Metastore integration enabled, I don't 
> see the log message emitted by the following code:
> {code:cpp}
>     if (hms_catalog_) {
>       static const char* const kNotificationLogEventIdDescription =
>           "Loading latest processed Hive Metastore notification log event ID";
>       LOG(INFO) << kNotificationLogEventIdDescription << "...";{code}
> https://github.com/apache/kudu/blob/e742f86f6d8e687dd02d9891f33e068477163016/src/kudu/master/catalog_manager.cc#L1446-L1447
> The Kudu version used in Impala's test env is commit e742f86f6. I patched 
> Kudu with [^kudu-debug.patch] to add more debug logs, and then saw the 
> following logs:
> {noformat}
> I20250309 09:44:28.930799 22774 catalog_manager.cc:1434] Initializing 
> in-progress tserver states...
> I20250309 09:44:28.930833 22774 catalog_manager.cc:1455] hms_catalog_ is 
> nullptr
> I20250309 09:44:28.930859 22765 hms_catalog.cc:109] Initializing HmsCatalog
> I20250309 09:44:28.931007 22790 hms_notification_log_listener.cc:222] 
> durable_event_id = -1, batch_size = 100 
> I20250309 09:44:28.936439 22788 hms_client.cc:369] Fetching 100 HMS events 
> from id -1{noformat}
> This means there is a race between initializing hms_catalog_ 
> [here|https://github.com/apache/kudu/blob/e742f86f6d8e687dd02d9891f33e068477163016/src/kudu/master/catalog_manager.cc#L1043]
>  and using it 
> [here|https://github.com/apache/kudu/blob/e742f86f6d8e687dd02d9891f33e068477163016/src/kudu/master/catalog_manager.cc#L1444].
> As a result, hms_notification_log_event_id_ is not loaded and keeps its 
> initial value of -1, so the hms-notification-log-listener thread in 
> kudu-master starts fetching all HMS notification events. In my local env, 
> many of these events have a large message body, and HMS runs out of memory 
> (OOM) while serving the requests. HmsNotificationLogListenerTask::Poll() 
> therefore never succeeds and keeps polling events from id -1. The master 
> logs are attached: 
> [^kudu-master.quanlong-OptiPlex-BJ.quanlong.log.INFO.20250309-094427.22679]
> Due to this, creating managed tables in Kudu fails with the following 
> error. IMPALA-13846 is an example.
> {code}
> failed to wait for Hive Metastore notification log listener to catch up: 
> failed to retrieve notification log events: failed to get Hive Metastore next 
> notification: No more data to read.{code}
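
For readers who want to see the ordering problem in isolation, the following is a minimal C++ sketch of the race described above. It is not Kudu code and not the fix uploaded to Gerrit: FakeHmsCatalog, FakeCatalogManager, and all method names are hypothetical stand-ins, and the event ID 1234 is made up. It assumes only what the description states: the load step is skipped when hms_catalog_ is still null, so the event ID keeps its initial value of -1.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the HMS catalog; not Kudu's HmsCatalog class.
struct FakeHmsCatalog {
  int64_t persisted_event_id;
};

// Hypothetical stand-in for the catalog manager state discussed above.
class FakeCatalogManager {
 public:
  // Initialization step: publishes the catalog object.
  void InitHmsCatalog(int64_t persisted_event_id) {
    std::lock_guard<std::mutex> l(lock_);
    hms_catalog_.reset(new FakeHmsCatalog{persisted_event_id});
  }

  // Mirrors the reported pattern: if the catalog is not set yet, the load
  // is skipped without any signal to the caller.
  void LoadEventIdSilently() {
    std::lock_guard<std::mutex> l(lock_);
    if (hms_catalog_) {
      event_id_ = hms_catalog_->persisted_event_id;
    }
  }

  // One possible alternative (an assumption, not the actual fix): report
  // the skipped load so that the caller can retry or abort.
  bool TryLoadEventId() {
    std::lock_guard<std::mutex> l(lock_);
    if (!hms_catalog_) {
      return false;
    }
    event_id_ = hms_catalog_->persisted_event_id;
    return true;
  }

  int64_t event_id() {
    std::lock_guard<std::mutex> l(lock_);
    return event_id_;
  }

 private:
  std::mutex lock_;
  std::unique_ptr<FakeHmsCatalog> hms_catalog_;
  int64_t event_id_ = -1;  // same initial value as in the report
};

// Deterministic replay of the bad ordering: the load runs before init.
int64_t ReplayRacyOrdering() {
  FakeCatalogManager m;
  m.LoadEventIdSilently();  // catalog still null, load is skipped
  m.InitHmsCatalog(1234);
  return m.event_id();
}

// Same ordering, but the caller notices the skipped load and retries.
int64_t ReplayCheckedOrdering() {
  FakeCatalogManager m;
  bool loaded = m.TryLoadEventId();  // catalog still null, returns false
  m.InitHmsCatalog(1234);
  if (!loaded) {
    loaded = m.TryLoadEventId();
  }
  return loaded ? m.event_id() : -1;
}
```

In this sketch, ReplayRacyOrdering() leaves the event ID at -1 even though a persisted value exists, which matches the "Fetching 100 HMS events from id -1" line in the log. ReplayCheckedOrdering() ends with the persisted value because the skipped load is visible to the caller. A real fix could equally well enforce the ordering itself, for example by completing initialization before the load can run.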



