Re: Anyway to avoid creating subdirectories by "Insert with union"

2016-02-24 Thread Gopal Vijayaraghavan
> SET mapred.input.dir.recursive=TRUE; ... > Can we set above setting as tblProperties or Hive Table properties. Not directly, those are MapReduce properties - they are not settable via Hive tables. That said, you can write your own SemanticAnalyzerHooks to do pretty much anything you want like t

Null pointer error with UNION ALL of Sub Queries

2016-02-24 Thread mahender bigdata
Hi, We are using Hive 1.2, and I get a null pointer exception whenever I use UNION ALL with the Tez execution engine. I see there is a JIRA raised for this, https://issues.apache.org/jira/browse/HIVE-7765. The exception occurs because one of the tables has zero entries. Is this iss

Re: Using Spark functional programming rather than SQL, Spark on Hive tables

2016-02-24 Thread Mich Talebzadeh
Well spotted, Sab. You are correct. An oversight by me. They should both use "sales". The results are now comparable. The following statement "On the other hand using SQL the query 1 takes 19 seconds compared to just under 4 minutes for functional programming The second query using SQL ta

Re: Anyway to avoid creating subdirectories by "Insert with union"

2016-02-24 Thread mahender bigdata
Thanks Gopal. This is an architectural change from Hive 0.13 to Hive 1.2. We are migrating our Hive query from 0.13 to 1.2. Previously it ran perfectly against 0.13, but the same query fails in 1.2 because the union/union-all performance improvement creates subdirectories. W

Re: Hive 2 performance

2016-02-24 Thread Mich Talebzadeh
Correct, hence the question, as I have done some preliminary tests on Hive 2. I want to share insights with other people who have performed the same. HTH On 24/02/2016 17:33, Jörn Franke wrote: > This highly depends on data, optimization and queries and you have to always > do some own tes

Re: Hive 2 performance

2016-02-24 Thread Jörn Franke
This highly depends on data, optimization and queries, and you always have to do your own tests. You can of course use the public Hive benchmark tools, but in the end you have to fit it to your situation. > On 24 Feb 2016, at 18:31, Mich Talebzadeh > wrote: > > well I meant how fast it returns

Re: Hive 2 performance

2016-02-24 Thread grimaldi.vince...@gmail.com
Well, he asked about performance... nobody asked about implications. Is it comparable to an MPP DBMS, or still slow because of the MapReduce / Tez limits? On 24 Feb 2016 17:25, "Jörn Franke" wrote: > I am not sure what you are looking for. Performance has many influence > factors... > > On 24 Feb 20

Re: Hive 2 performance

2016-02-24 Thread Mich Talebzadeh
Well, I meant how fast it returns the results in this case compared to 1.2.1 etc. Thanks On 24/02/2016 17:25, Jörn Franke wrote: > I am not sure what you are looking for. Performance has many influence > factors... > > On 24 Feb 2016, at 18:23, Mich Talebzadeh > wrote: > >> Hi, >> >>

Re: Hive 2 performance

2016-02-24 Thread Jörn Franke
I am not sure what you are looking for. Performance has many influence factors... > On 24 Feb 2016, at 18:23, Mich Talebzadeh > wrote: > > Hi, > > > > Has anyone got some performance matrix for Hive 2 from user perspective? > > It looks very impressive on ORC tables. > > thanks > > --

Hive 2 performance

2016-02-24 Thread Mich Talebzadeh
Hi, Has anyone got some performance matrix for Hive 2 from user perspective? It looks very impressive on ORC tables. thanks -- Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw http://talebzadehmich.wordpress.com NOTE: The in

Re: Using Spark functional programming rather than SQL, Spark on Hive tables

2016-02-24 Thread Mich Talebzadeh
HI, TOOLS SPARK 1.5.2, HADOOP 2.6, HIVE 2.0, SPARK-SHELL, HIVE DATABASE OBJECTIVES: TIMING DIFFERENCES BETWEEN RUNNING SPARK USING SQL AND RUNNING SPARK USING FUNCTIONAL PROGRAMMING (FP) (FUNCTIONAL CALLS) ON HIVE TABLES UNDERLYING TABLES: THREE TABLES IN HIVE DATABASE USING ORC FORMAT

Re: DML in HIVE using the Java API

2016-02-24 Thread Mich Talebzadeh
Hi, How about using SQL in beeline? On 24/02/2016 12:41, Daniel Klinger wrote: > I'm writing a Java application which does DDL and DML in Hive tables. For > DDL I use the Hive class org.apache.hadoop.hive.ql.metadata.Hive, which is > public since version 1.0. It's perfect for DDL and I

DML in HIVE using the Java API

2016-02-24 Thread Daniel Klinger
I'm writing a Java application which does DDL and DML in Hive tables. For DDL I use the Hive class org.apache.hadoop.hive.ql.metadata.Hive, which is public since version 1.0. It's perfect for DDL and, I think, faster than JDBC and other options. But I couldn't find out how to do DML in Java (parti