Hey

Instead of going into the Hive CLI interactively,
I would propose 2 ways:

NOHUP
nohup hive -f path/to/query/file/hive1.hql >> ./hive1.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
nohup hive -f path/to/query/file/hive2.hql >> ./hive2.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
nohup hive -f path/to/query/file/hive3.hql >> ./hive3.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
nohup hive -f path/to/query/file/hive4.hql >> ./hive4.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
nohup hive -f path/to/query/file/hive5.hql >> ./hive5.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &

Each statement above will launch MR jobs on your cluster and, depending on the 
cluster configuration, the jobs will run in parallel.
Scheduling jobs on the MR cluster is independent of Hive.
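
If you have many query files, the nohup approach could be wrapped in a small bash function along these lines. This is only a sketch: run_hive_parallel, HIVE_BIN and LOG_DIR are names I made up for illustration, and it assumes bash and that each .hql file can run independently of the others.

```shell
#!/usr/bin/env bash
# Sketch: launch several Hive scripts in the background and wait for all of them.
# run_hive_parallel, HIVE_BIN and LOG_DIR are illustrative names, not part of Hive.

HIVE_BIN="${HIVE_BIN:-hive}"   # command used to run a script (assumed to accept -f)
LOG_DIR="${LOG_DIR:-.}"        # directory that receives one log file per script

run_hive_parallel() {
    local stamp hql pid failed=0
    local pids=()
    stamp="$(date +%Y-%m-%d-%H-%M-%S)"
    for hql in "$@"; do
        # The trailing '&' backgrounds each process so the loop does not block.
        nohup "$HIVE_BIN" -f "$hql" >> "$LOG_DIR/$(basename "$hql")_${stamp}.log" 2>&1 &
        pids+=("$!")
    done
    # Wait for every background job and count the ones that exited non-zero.
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=$((failed + 1))
    done
    return "$failed"
}

# Example call (paths as in the commands above):
#   run_hive_parallel path/to/query/file/hive1.hql path/to/query/file/hive2.hql
```

The return value is the number of scripts that failed, so 0 means every query finished cleanly. Whether the MR jobs actually overlap still depends on the cluster scheduler, as noted above.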

SCREEN sessions

  *   Create a screen session
     *   screen -S hive_query1
     *   You are now inside the screen session hive_query1
        *   hive -f path/to/query/file/hive1.hql
     *   Ctrl-A, then D
        *   You detach from the screen session
  *   Repeat for each Hive query you want to run
     *   I.e. say 5 screen sessions, each running a Hive query
  *   To display the active screen sessions
     *   screen -ls
  *   To attach to a screen session
     *   screen -x hive_query1
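
The screen steps above can also be scripted so you do not have to create and detach each session by hand. A rough sketch, assuming GNU screen (its -d -m options start a session already detached); start_hive_screens, SCREEN_BIN and HIVE_BIN are hypothetical names used only for this example.

```shell
#!/usr/bin/env bash
# Sketch: start one detached screen session per Hive script.
# start_hive_screens, SCREEN_BIN and HIVE_BIN are illustrative names.

SCREEN_BIN="${SCREEN_BIN:-screen}"
HIVE_BIN="${HIVE_BIN:-hive}"

start_hive_screens() {
    local hql name
    for hql in "$@"; do
        # Session name derived from the file name, e.g. hive1.hql -> hive_hive1
        name="hive_$(basename "$hql" .hql)"
        # -d -m starts the session detached; -S gives it a name.
        "$SCREEN_BIN" -dmS "$name" "$HIVE_BIN" -f "$hql"
        echo "started screen session: $name"
    done
}

# Example call:
#   start_hive_screens path/to/query/file/hive1.hql path/to/query/file/hive2.hql
```

Note that a session started this way ends as soon as the hive command exits, so attach while the query is still running if you want to watch its output.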

Thanks
Warm Regards

Sanjay

From: saurabh <mpp.databa...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Monday, April 21, 2014 at 1:53 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Executing Hive Queries in Parallel


Hi,
I need some input on executing Hive queries in parallel. I tried doing this 
using the CLI (by opening multiple ssh connections) and executed 4 HQLs; the 
queries were observed to execute sequentially. All four queries got 
submitted; however, while the first one was executing, the others were in a 
pending state. I was performing this activity on EMR running in batch mode, 
hence I wasn't able to dig into the logs.

The Hive CLI uses a native Hive connection, which by default uses the FIFO 
scheduler. This might be one of the reasons for the queries getting executed in 
sequence.

I also observed that when multiple queries are executed using multiple HUE 
sessions, it provides the parallel execution functionality. Can you please 
suggest how the functionality of HUE can be replicated using CLI?

I am aware of the beeswax client; however, I am not sure how it can be used 
during EMR batch mode processing.

Thanks in advance for going through this. Kindly let me know your thoughts on 
the same.
