[ https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135733#comment-16135733 ]
Sergio Peña commented on HIVE-16886: ------------------------------------ [~anishek] [~thejas] Thanks for starting working on this. I have a few of comments about using NL_ID that we should keep in mind on the patch. This is based on some other tests we have done with datastore incremental IDs. 1. Use the {{sequence}} strategy on the NL_ID. This should keep increments atomic. However, we have found that these increments do not guarantee an order when committing them. We have seen commits ID of order 1,3,2 because of some delay in the transaction commit. We call this out-of-order as 'holes' because when fetching updates we may get 1,3 only (2 is not there because will be committed later). 2. To solve the holes problem on our client requests when getting new notifications, we should request them in a temporal order instead of ID order. To do that, we should write a new API to get notifications based on a timestamp instead of an ID. Timestamps will keep the order correctly. 3. Use a SQL timestamp instead of using the one on DbNotificationListener#now() method. If possible, use a milliseconds timestamp, such as {{current_timestamp(6)}} from SQL. The now() method run in different HMS servers may be different and may be out-of-sync having a weird order sometimes. We found that {{current_timestamp(6)}} is executed at the moment of the SQL execution; so it is the best time we can get of the transaction. This next section only applies to what we do as a client requesting notifications (not part for the patch, but useful to know). On the client side, we sometimes do not request notifications for a period higher than the HMS clean-up thread. This means that we may miss notifications that were removed during that time. To avoid this issue on our side, we're requesting HMS notifications for a time window period and reapplying all of them for that time. If for some reason we get fewer notifications than expected, then we assume that older events were purged, and we may request a new HMS snapshot if necessary. > HMS log notifications may have duplicated event IDs if multiple HMS are > running concurrently > -------------------------------------------------------------------------------------------- > > Key: HIVE-16886 > URL: https://issues.apache.org/jira/browse/HIVE-16886 > Project: Hive > Issue Type: Bug > Components: Hive, Metastore > Reporter: Sergio Peña > Assignee: anishek > > When running multiple Hive Metastore servers and DB notifications are > enabled, I could see that notifications can be persisted with a duplicated > event ID. > This does not happen when running multiple threads in a single HMS node due > to the locking acquired on the DbNotificationsLog class, but multiple HMS > could cause conflicts. > The issue is in the ObjectStore#addNotificationEvent() method. The event ID > fetched from the datastore is used for the new notification, incremented in > the server itself, then persisted or updated back to the datastore. If 2 > servers read the same ID, then these 2 servers write a new notification with > the same ID. > The event ID is not unique nor a primary key. > Here's a test case using the TestObjectStore class that confirms this issue: > {noformat} > @Test > public void testConcurrentAddNotifications() throws ExecutionException, > InterruptedException { > final int NUM_THREADS = 2; > CountDownLatch countIn = new CountDownLatch(NUM_THREADS); > CountDownLatch countOut = new CountDownLatch(1); > HiveConf conf = new HiveConf(); > conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, > MockPartitionExpressionProxy.class.getName()); > ExecutorService executorService = > Executors.newFixedThreadPool(NUM_THREADS); > FutureTask<Void> tasks[] = new FutureTask[NUM_THREADS]; > for (int i=0; i<NUM_THREADS; i++) { > final int n = i; > tasks[i] = new FutureTask<Void>(new Callable<Void>() { > @Override > public Void call() throws Exception { > ObjectStore store = new ObjectStore(); > store.setConf(conf); > NotificationEvent dbEvent = > new NotificationEvent(0, 0, > EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n); > System.out.println("ADDING NOTIFICATION"); > countIn.countDown(); > countOut.await(); > store.addNotificationEvent(dbEvent); > System.out.println("FINISH NOTIFICATION"); > return null; > } > }); > executorService.execute(tasks[i]); > } > countIn.await(); > countOut.countDown(); > for (int i = 0; i < NUM_THREADS; ++i) { > tasks[i].get(); > } > NotificationEventResponse eventResponse = > objectStore.getNextNotification(new NotificationEventRequest()); > Assert.assertEquals(2, eventResponse.getEventsSize()); > Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId()); > // This fails because the next notification has an event ID = 1 > Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId()); > } > {noformat} > The last assertion fails expecting an event ID 1 instead of 2. -- This message was sent by Atlassian JIRA (v6.4.14#64029)