Hive query translation to tez execution code

2015-08-09 Thread Grandl Robert
Hi guys, In case of MapReduce, we can have access to the class files for Map/Reduce tasks which are actually executed by a map/reduce job. I am wondering how this translates to running hive queries atop Tez. Basically I am trying to snapshot the actual code which is executed by a Tez task, ass

Re: run tpcds queries

2014-09-17 Thread Grandl Robert
Hmm. It seems the problem is that some columns do not exist in the tpcds_text DB, so they need to be renamed accordingly. Is this the case, and is that the right fix? On Wednesday, September 17, 2014 3:45 PM, Grandl Robert wrote: Hi guys, A slightly off-topic question, but not sure where

run tpcds queries

2014-09-17 Thread Grandl Robert
Hi guys, A slightly off-topic question, but not sure where to ask. Have you ever run TPC-DS queries over Tez? If yes, do they work using tpcds_text_* databases as well? Running tpch- queries works fine over tpch_text_* databases, but it looks like tpcds- queries work only when using ORC da

speculative execution in Hive over Tez

2014-09-14 Thread Grandl Robert
Hi guys, I am wondering how to enable speculative execution when running Hive queries over Tez. In MapReduce there are parameters for map/reduce tasks in mapred-site.xml. For Hive, it looks like they are similar when running over MapReduce. However, I cannot find anything related to enable spe

Re: hive unable to use metastore database

2014-08-28 Thread Grandl Robert
I figured out what was happening. By default on Ubuntu, MySQL has bind-address set to localhost, so you need to go into /etc/mysql/my.cnf and comment out the line: bind-address = 127.0.0.1 Robert On Thursday, August 28, 2014 3:32 PM, Grandl Robert wrote: Hi guys, I was trying to
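The fix described above can be sketched in shell. This is a minimal illustration, not the poster's exact procedure: the helper name `comment_bind_address` and the scratch file are assumptions made for the example, and it relies on GNU `sed -i` as shipped on Ubuntu. It edits a temporary copy rather than the live /etc/mysql/my.cnf.

```shell
# Sketch: comment out the bind-address line in a MySQL config file.
# comment_bind_address is a hypothetical helper, not part of MySQL or Hive.
comment_bind_address() {
  # $1: path to the config file to edit in place (GNU sed assumed)
  sed -i 's/^\([[:space:]]*\)bind-address/\1#bind-address/' "$1"
}

# Demonstrate on a scratch file instead of the real /etc/mysql/my.cnf.
tmp_cnf="$(mktemp)"
printf '[mysqld]\nbind-address = 127.0.0.1\nport = 3306\n' > "$tmp_cnf"
comment_bind_address "$tmp_cnf"
cat "$tmp_cnf"
```

On a real host the same function would presumably be pointed at /etc/mysql/my.cnf with root privileges, followed by a restart of the MySQL service so the change takes effect.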

hive unable to use metastore database

2014-08-28 Thread Grandl Robert
Hi guys, I was trying to configure the metastore database, following the steps: 1. sudo apt-get install mysql-server 2. sudo service mysql start 3. sudo apt-get install libmysql-java -> copied /usr/share/java/libmysql-*.jar to $HIVE_HOME/lib 4. sudo /usr/bin/mysql_secure_installation [...] Ent

Re: pass new job name to tez

2014-08-25 Thread Grandl Robert
I created another variable, hive.job.name, set it from the SQL script, and it works. Thanks, Robert On Monday, August 25, 2014 5:40 PM, Grandl Robert wrote: Hi guys, I am still struggling with finding a way to specify a job name in the hive query and propagate that name in Tez dag. For
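A rough sketch of how such a name might be composed and handed to Hive follows. `hive.job.name` is the custom variable mentioned in the post, not a stock Hive setting, and `dag_name` is a hypothetical helper. The post sets the variable inside the SQL script; passing it with `--hiveconf` on the command line, as printed here, is an assumption. The command is only echoed, not executed.

```shell
# Build a DAG identifier of the form <query>_<size>MB, e.g. query55_2048MB.
# dag_name is a hypothetical helper introduced for this example.
dag_name() {
  # $1: query file (e.g. query55.sql); $2: input size in MB
  printf '%s_%sMB\n' "$(basename "$1" .sql)" "$2"
}

name="$(dag_name query55.sql 2048)"

# hive.job.name is the poster's own variable; Hive will not act on it
# unless the code has been changed to read it, as the thread describes.
echo "hive --hiveconf hive.job.name=${name} -f query55.sql"
```

Deriving the name from the file name and input size keeps the identifier reproducible across runs of the same benchmark query.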

Re: pass new job name to tez

2014-08-25 Thread Grandl Robert
Thanks, Navis 2014-07-10 12:43 GMT+09:00 Grandl Robert : Hi guys, I am trying to identify a DAG in Tez with a different id, based on job name (e.g. query55.sql from hive-testbench) + input size. So my new identifier should be for example query55_2048MB. It s

hive atop tez 0.5.0

2014-08-25 Thread Grandl Robert
Hi guys, I have a hard time running hive atop tez 0.5. I found from the tez mailing list that the only version of hive compatible with tez 0.5 is the tez branch from the hive source tree (so not hive 0.13). However, I did a git checkout tez in hive source tree branch, but when trying to compile, I got t

pass new job name to tez

2014-07-09 Thread Grandl Robert
Hi guys, I am trying to identify a DAG in Tez with a different id, based on job name (e.g. query55.sql from hive-testbench) + input size. So my new identifier should be for example query55_2048MB. It seems that a DAG in Tez already takes a name which comes from a jobPlan.getName() passed

[no subject]

2014-07-09 Thread Grandl Robert
Hi guys, I am trying to identify a DAG in Tez with a different id, based on job name (e.g. query55.sql from hive-testbench) + input size. So my new identifier should be for example query55_2048MB. It seems that a DAG in Tez already takes a name which comes from a jobPlan.getName() passed

Hive -> pass new job name to tez

2014-07-09 Thread Grandl Robert
Hi guys, I am trying to identify a DAG in Tez with a different id, based on job name (e.g. query55.sql from hive-testbench) + input size. So my new identifier should be for example query55_2048MB. It seems that a DAG in Tez already takes a name which comes from a jobPlan.getName() passed

Re: DDLTask. Database does not exist:

2014-06-21 Thread Grandl Robert
here, the DB is recognized. If I do hadoop@nectar-11:~/rgrandl/hive/hive-testbench$ cd sample-queries-tpcds/ and start hive again, then the DB does not exist. Any idea why that is? On Saturday, June 21, 2014 4:30 PM, Grandl Robert wrote: Hi guys, I have created a database using

DDLTask. Database does not exist:

2014-06-21 Thread Grandl Robert
Hi guys, I have created a database using this link: https://github.com/cartershanklin/hive-testbench, which basically creates a database: tpcds_bin_partitioned_orc_20.db and input data using TPC-DS benchmark. The database is visible in HDFS under /user/hive/warehouse/tpcds_bin_partitioned_o

Fw: hive + tez + yarn 2.4

2014-06-18 Thread Grandl Robert
Not sure whether the hive or tez mailing list is the best place to ask. Thanks, robert On Wednesday, June 18, 2014 10:25 AM, Grandl Robert wrote: Hi guys, I was trying to run hive atop tez atop yarn 2.4. Setting mapreduce.framework.name to yarn-tez enables the tez execution engine and I can

question regarding the DAG scheduler

2014-04-15 Thread Grandl Robert
Hi, Can someone give me some details (or some pointers) on how DAG scheduling happens in Hive? For example: in what order are the tasks (w/o dependencies) chosen for scheduling, how does the assignment between tasks and nodes happen, etc. Thanks, robert