[ https://issues.apache.org/jira/browse/HIVE-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397647#comment-16397647 ]
Sergey Shelukhin commented on HIVE-18281:
-----------------------------------------

Left comments on RB, mostly minor.
I am not sure about timing/races w.r.t. leadership switching. If activities during a leadership switch take too long, should they time out, etc.?
I'm hoping/assuming the latch itself has some grace period that should be configurable and/or accounted for.
There also appear to be some unrelated logic changes in some places.

> HiveServer2 HA for LLAP and Workload Manager
> --------------------------------------------
>
>                 Key: HIVE-18281
>                 URL: https://issues.apache.org/jira/browse/HIVE-18281
>             Project: Hive
>          Issue Type: New Feature
>    Affects Versions: 3.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>         Attachments: HIVE-18281.1.patch, HIVE-18281.2.patch, HIVE-18281.WIP.patch, HSI-HA.pdf
>
>
> When running HS2 with LLAP and Workload Manager, HS2 becomes a single point of
> failure because some of the state for workload management and scheduling is
> maintained in memory.
> The proposal is to support an Active/Passive mode of high availability in which
> all HS2 instances and Tez AMs register with ZooKeeper and a leader is chosen
> to maintain the stateful information. Clients using service discovery
> will always connect to the leader to submit queries. The leader will also have
> some additional responsibilities: failover handling, Tez session
> reconnect, etc. Will upload more detailed information in a separate doc.
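
To make the timing question in the comment concrete (should activities during a leadership switch time out?), here is a minimal sketch in plain Java of bounding such an activity with a timeout. It is not taken from the HIVE-18281 patch and does not use the ZooKeeper latch. The class name LeadershipTransition, the method runWithTimeout and the timeout values are all hypothetical, chosen only for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration only; not part of the HIVE-18281 patch. */
public class LeadershipTransition {

  /**
   * Runs a leadership-switch activity on a separate thread and waits at most
   * timeoutMs for it. Returns true if the activity finished in time, false if
   * it timed out (in which case the activity's thread is interrupted).
   */
  public static boolean runWithTimeout(Runnable activity, long timeoutMs)
      throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<?> future = executor.submit(activity);
      try {
        future.get(timeoutMs, TimeUnit.MILLISECONDS);
        return true;
      } catch (TimeoutException e) {
        // Assumption: on timeout we cancel and let the caller decide what to do
        // next (for example, give up leadership or retry).
        future.cancel(true);
        return false;
      }
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    // An activity that returns immediately completes within the limit.
    boolean fast = runWithTimeout(() -> { }, 1000);

    // An activity that sleeps far longer than the limit is cut off.
    boolean slow = runWithTimeout(() -> {
      try {
        Thread.sleep(5000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, 100);

    System.out.println("fast completed: " + fast);
    System.out.println("slow completed: " + slow);
  }
}
```

Whether a timed-out activity should cause the instance to give up leadership, retry, or fail is the open design question raised in the comment. The sketch only shows how the limit could be made explicit and configurable.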