[ https://issues.apache.org/jira/browse/HIVE-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chinna Rao Lalam updated HIVE-2254:
-----------------------------------

    Attachment: zkclient-0.1.0.jar

> Provide an automatic recovery feature for Hive Server in case of failure
> ------------------------------------------------------------------------
>
>                 Key: HIVE-2254
>                 URL: https://issues.apache.org/jira/browse/HIVE-2254
>             Project: Hive
>          Issue Type: New Feature
>          Components: Clients, Query Processor, Server Infrastructure
>    Affects Versions: 0.5.0, 0.7.1
>         Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise
> Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
>            Reporter: Chinna Rao Lalam
>            Assignee: Chinna Rao Lalam
>         Attachments: HIVE-2254.patch, Hive Automatic Recovery Solution.pdf,
> zk_les.jar, zkclient-0.1.0.jar
>
>
> *Motivation*
> We do log analysis with Hive by submitting queries through Hive Server.
> We have provided NameNode HA and JobTracker HA to achieve high
> availability, but Hive Server is currently a single point of failure. If
> the machine running Hive Server is down or broken, the Hive service is
> unavailable until someone notices the Hive Server failure and brings it
> back up; during that time our log analysis stops. To avoid this problem we
> need an automatic system that detects the failure and ensures high
> availability of the server.
> *Proposal*
> Deploy two Hive Servers. One acts as the active server and the other as a
> hot standby. We need a mechanism to decide which server is active and
> which is standby, plus a failure detection mechanism that detects when the
> active server is down or broken and triggers the switchover (standby to
> active). This failure detection mechanism will be based on ZooKeeper
> (HA Agent).
> Clients of Hive Server should be configured with the addresses of both
> servers. When obtaining a connection, the client detects the active Hive
> Server and connects to it.
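
The client-side behaviour described in the proposal (configure both addresses, then connect to whichever server is active) could look roughly like the sketch below. This is an illustration only and is not taken from HIVE-2254.patch: the class name `ActiveServerLocator`, the `isActive` probe, and the `host:port` strings are all assumptions. In a real deployment the probe would presumably consult ZooKeeper or attempt a connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/**
 * Hypothetical client-side helper that picks the active Hive Server from a
 * configured list. None of these names come from the HIVE-2254 patch.
 */
final class ActiveServerLocator {

    // Configured server addresses, e.g. "host:port" for both Hive Servers.
    private final List<String> servers;

    // Assumed probe: returns true if the given server is currently active.
    private final Predicate<String> isActive;

    ActiveServerLocator(List<String> servers, Predicate<String> isActive) {
        this.servers = new ArrayList<>(servers); // defensive copy
        this.isActive = isActive;
    }

    /** Returns the first configured server that the probe reports as active. */
    Optional<String> findActive() {
        for (String server : servers) {
            try {
                if (isActive.test(server)) {
                    return Optional.of(server);
                }
            } catch (RuntimeException e) {
                // A probe failure is treated as "not active"; try the next server.
            }
        }
        return Optional.empty();
    }
}
```

A caller would build the locator with both configured addresses and open its connection to whatever `findActive()` returns, retrying later if the result is empty (for example, while a switchover is still in progress).
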
> If Hive Server goes down while a query is executing, the query must be
> resubmitted after Hive Server is restarted, but the previously submitted
> query keeps running in the background. Continuing that execution serves no
> purpose and wastes cluster resources. In this solution, once the active
> server goes down the standby becomes active, and it ensures that execution
> of the previously submitted query (Hive tasks and MapRed jobs) is stopped.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira