[ https://issues.apache.org/jira/browse/HIVE-18940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399869#comment-16399869 ]
Vihang Karajgaonkar commented on HIVE-18940: -------------------------------------------- I think part of the issue is that the event id generation mechanism is done using a different table which always has 1 row {{NOTIFICATION_SEQUENCE}} Why can't we use auto-increment directly on the {{NOTIFICATION_LOG}} table for event id? I know derby doesn't support auto-increment but I think most production clusters will be not using derby. Designing a feature just to make it work for derby does not seem to be a good idea. If derby doesn't support auto-increments we should treat it as an exception and handle that in the code separately. We should also separate event ID and commit id for an event. Currently, the event id is strictly tied with the actual commit time which is why we have to hold the lock at the generation time until the transaction commits which in theory could take a long time. Also, the timing of generating the event id and actual commit is non-obvious in the code. So it is easy to miss that while writing the code. I think it would be great to use something like auto-increment to just uniquely identify a notification log message. The actual commit id should be generated from a global monotonically increasing number at the actual commit time. This number should apply to all the events pertaining to that transaction. A transaction which alters 1000 partitions should not have 1000 different ids because they were not committed one by one in 1000 transactions. They were all committed as one transaction and hence ideally should only generate one commit id. This would greatly help with the lock durations for long transactions because now commit lock is held for a constant time irrespective of how long that transaction ran. > Hive notifications serialize all write DDL operations > ----------------------------------------------------- > > Key: HIVE-18940 > URL: https://issues.apache.org/jira/browse/HIVE-18940 > Project: Hive > Issue Type: Bug > Components: Metastore > Affects Versions: 3.0.0 > Reporter: Alexander Kolbasov > Priority: Major > > The implementation of DbNotificationListener uses a single row to store > current notification ID and uses {{SELECT FOR UPDATE}} to lock the row. This > serializes all write DDL operations which isn't good. > We should consider using database auto-increment for notification ID instead. > Especially on mMySQL/innoDb it is supported natively with relatively > light-weight locking. > This creates potential issue for consumers though because such IDs may have > holes. There are two types of holes - transient hole for a transaction which > have not committed yet and will be committed shortly and permanent holes for > transactions that fail. Consumers need to deal with it. It may be useful to > add DB-generated timestamp as well to assist in recovery from holes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)