[ 
https://issues.apache.org/jira/browse/HIVE-18940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399869#comment-16399869
 ] 

Vihang Karajgaonkar commented on HIVE-18940:
--------------------------------------------

I think part of the issue is that the event id generation mechanism is done 
using a different table which always has 1 row {{NOTIFICATION_SEQUENCE}} Why 
can't we use auto-increment directly on the {{NOTIFICATION_LOG}} table for 
event id? I know derby doesn't support auto-increment but I think most 
production clusters will be not using derby. Designing a feature just to make 
it work for derby does not seem to be a good idea. If derby doesn't support 
auto-increments we should treat it as an exception and handle that in the code 
separately. We should also separate event ID and commit id for an event. 
Currently, the event id is strictly tied with the actual commit time which is 
why we have to hold the lock at the generation time until the transaction 
commits which in theory could take a long time. Also, the timing of generating 
the event id and actual commit is non-obvious in the code. So it is easy to 
miss that while writing the code.

I think it would be great to use something like auto-increment to just uniquely 
identify a notification log message. The actual commit id should be generated 
from a global monotonically increasing number at the actual commit time. This 
number should apply to all the events pertaining to that transaction. A 
transaction which alters 1000 partitions should not have 1000 different ids 
because they were not committed one by one in 1000 transactions. They were all 
committed as one transaction and hence ideally should only generate one commit 
id. This would greatly help with the lock durations for long transactions 
because now commit lock is held for a constant time irrespective of how long 
that transaction ran.

> Hive notifications serialize all write DDL operations
> -----------------------------------------------------
>
>                 Key: HIVE-18940
>                 URL: https://issues.apache.org/jira/browse/HIVE-18940
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 3.0.0
>            Reporter: Alexander Kolbasov
>            Priority: Major
>
> The implementation of DbNotificationListener uses a single row to store 
> current notification ID and uses {{SELECT FOR UPDATE}} to lock the row. This 
> serializes all write DDL operations which isn't good.
> We should consider using database auto-increment for notification ID instead. 
> Especially on mMySQL/innoDb it is supported natively with relatively 
> light-weight locking. 
> This creates potential issue for consumers though because such IDs may have 
> holes. There are two types of holes - transient hole for a transaction which 
> have not committed yet and will be committed shortly and permanent holes for 
> transactions that fail. Consumers need to deal with it. It may be useful to 
> add DB-generated timestamp as well to assist in recovery from holes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to