[ 
https://issues.apache.org/jira/browse/HIVE-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2254:
-----------------------------------

    Attachment: zkclient-0.1.0.jar

> Provide an automatic recovery feature for Hive Server in case of failure
> ------------------------------------------------------------------------
>
>                 Key: HIVE-2254
>                 URL: https://issues.apache.org/jira/browse/HIVE-2254
>             Project: Hive
>          Issue Type: New Feature
>          Components: Clients, Query Processor, Server Infrastructure
>    Affects Versions: 0.5.0, 0.7.1
>         Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise 
> Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
>            Reporter: Chinna Rao Lalam
>            Assignee: Chinna Rao Lalam
>         Attachments: HIVE-2254.patch, Hive Automatic Recovery Solution.pdf, 
> zk_les.jar, zkclient-0.1.0.jar
>
>
> *Motivation*
> We run log analysis on Hive by submitting queries through Hive Server. We
> have already provided NameNode HA and JobTracker HA, but Hive Server is
> currently a single point of failure. If the machine running Hive Server goes
> down or breaks, the Hive service is unavailable until someone notices the
> failure and brings the server back up, and our log analysis stalls in the
> meantime. To avoid this, we need an automatic mechanism that detects the
> failure and keeps Hive Server highly available.
> *Proposal*
> Deploy two Hive Servers. One acts as the active server while the other is a
> hot standby. This requires a system that decides which server is active and
> which is standby, plus a failure detection mechanism that detects when the
> active server is down or broken and triggers the switchover (standby to
> active). The failure detection mechanism will be based on ZooKeeper (HA
> Agent).
> Clients of Hive Server should be configured with the addresses of both
> servers. When opening a connection, the client detects the active Hive
> Server and connects to it.
> If Hive Server goes down while a query is executing, the query must be
> resubmitted once a Hive Server is available again, but the jobs of the
> original query keep running in the background. Continuing that execution
> serves no purpose and wastes cluster resources. In this solution, once the
> active server goes down, the standby becomes active, takes over serving, and
> stops the execution of the interrupted query (Hive tasks and MapReduce
> jobs).
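
The client-side part of the proposal (clients hold both server addresses and
connect to whichever is active) can be sketched as follows. This is only an
illustration, not code from the attached patch. It assumes that the standby
does not accept client connections, so the first server that accepts a TCP
connection is treated as the active one. The function name
connect_to_active and the timeout value are hypothetical.

```python
import socket
from typing import Iterable, Tuple

Address = Tuple[str, int]


def connect_to_active(servers: Iterable[Address],
                      timeout: float = 2.0) -> Tuple[Address, socket.socket]:
    """Return (address, socket) for the first server that accepts a connection.

    Assumption (not stated in the issue): the standby refuses or does not
    listen for client connections, so whichever server accepts is the active.
    """
    errors = []
    for host, port in servers:
        try:
            # Try each configured address in order; a refused connection or
            # a timeout raises OSError and we move on to the next address.
            sock = socket.create_connection((host, port), timeout=timeout)
            return (host, port), sock
        except OSError as exc:
            errors.append(f"{host}:{port} -> {exc}")
    # Neither server answered: report every attempt to the caller.
    raise ConnectionError("no active Hive Server reachable: " + "; ".join(errors))
```

A client would call it with both configured addresses, for example
connect_to_active([("hive1.example.com", 10000), ("hive2.example.com", 10000)]),
where the host names and port are placeholders. If the standby also listens
for connections, the probe would instead have to ask ZooKeeper (or the servers
themselves) which one currently holds the active role.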

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
