Why can Hive run normally without starting YARN?

2016-05-31 Thread Joseph
Hi all, I use Hadoop 2.7.2 and I start only HDFS, yet I can submit MapReduce jobs and run Hive 1.2.1. Do the jobs just execute locally if I don't start YARN? Joseph
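
A rough way to check the question above: in Hadoop 2.x the property mapreduce.framework.name defaults to "local" in mapred-default.xml, so unless mapred-site.xml sets it to "yarn", jobs run in-process through the local job runner and need no ResourceManager. The sketch below is only an illustration of that lookup. The helper names framework_name and runs_locally are hypothetical, and it reads a single XML string rather than reproducing Hadoop's full configuration resolution.

```python
import xml.etree.ElementTree as ET

# Default value shipped in Hadoop 2.x mapred-default.xml when nothing overrides it.
DEFAULT_FRAMEWORK = "local"


def framework_name(mapred_site_xml: str) -> str:
    """Return the mapreduce.framework.name set in a mapred-site.xml document.

    Hypothetical helper: looks at one XML string only and falls back to the
    Hadoop 2.x default when the property is absent or empty.
    """
    root = ET.fromstring(mapred_site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == "mapreduce.framework.name":
            value = (prop.findtext("value") or "").strip()
            return value or DEFAULT_FRAMEWORK
    return DEFAULT_FRAMEWORK


def runs_locally(mapred_site_xml: str) -> bool:
    """True when jobs would use the local job runner instead of YARN."""
    return framework_name(mapred_site_xml) == "local"


if __name__ == "__main__":
    empty_site = "<configuration></configuration>"
    yarn_site = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""
    print(runs_locally(empty_site))
    print(runs_locally(yarn_site))
```

If the property were set to "yarn" while YARN was down, job submission would be expected to fail rather than fall back silently, so jobs succeeding with only HDFS running is consistent with local execution.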

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-31 Thread Mich Talebzadeh
Thanks Gopal. SAP Replication Server (SRS) does it to Hive in real time as well. That is the main advantage of replication: it is real time. It picks up committed data from the log and sends it to Hive as well. Also it is way ahead of Sqoop, which really only does the initial load. It does 10k rows at

Re: [ANNOUNCE] Apache Hive 2.0.1 Released

2016-05-31 Thread Sergey Shelukhin
Oh. I just copy-pasted the Wiki text, perhaps it should be updated. From: Mich Talebzadeh <mich.talebza...@gmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Tuesday, May 31, 2016 at 14:01 To: user <user@hive.apache.org> Cc:

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-31 Thread Gopal Vijayaraghavan
> Can LLAP be used as a caching tool for data from Oracle DB or any RDBMS? No, LLAP intermediates HDFS. It holds column & index data streams as-is (i.e. dictionary encoding, RLE, bloom filters etc. are preserved). Because it does not cache row-tuples, it cannot exist as a caching tool for another

Fwd: [ANNOUNCE] Apache Hive 2.0.1 Released

2016-05-31 Thread Mich Talebzadeh
Thanks Sergey, Congratulations. May I add that Hive 0.14 and above can also deploy Spark as its execution engine and with Spark on Hive on Spark execution engine you have a winning combination. BTW we are just discussing the merits of TEZ + LLAP versus Spark as the execution engine for Spark. W

Re: [ANNOUNCE] Apache Hive 2.0.1 Released

2016-05-31 Thread Mich Talebzadeh
Thanks Sergey, Congratulations. May I add that Hive 0.14 and above can also deploy Spark as its execution engine and with Spark on Hive on Spark execution engine you have a winning combination. BTW we are just discussing the merits of TEZ + LLAP versus Spark as the execution engine for Spark. W

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-31 Thread Mich Talebzadeh
Thanks for that Gopal. Can LLAP be used as a caching tool for data from Oracle DB or any RDBMS? In that case does it use JDBC to get the data out from the underlying DB? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-31 Thread Gopal Vijayaraghavan
> but this sounds to me (without testing myself) adding caching capability >to TEZ to bring it on par with SPARK. Nope, that was the crux of the earlier email. "Caching" seems to be a catch-all term misused in that comparison. >> There is a big difference between where LLAP & SparkSQL, which has

[ANNOUNCE] Apache Hive 2.0.1 Released

2016-05-31 Thread Sergey Shelukhin
The Apache Hive team is proud to announce the release of Apache Hive version 2.0.1. The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides: * Tools to enable easy data extra

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-31 Thread Mich Talebzadeh
A couple of points, if I may; kindly bear with my remarks. It will be very interesting to try TEZ with LLAP. As I read from LLAP "Sub-second queries require fast query execution and low setup cost. The challenge for Hive is to achieve this without giving up on the scale and flexibility tha

RE: How to disable SMB join?

2016-05-31 Thread Markovitz, Dudu
Hi, The documentation describes a scenario where SMB join leads to the same error you’ve got. It claims that changing the order of the tables solves the problem. Dudu https://cwiki.apache.org/confluence/display/Hive/LanguageManual+JoinOptimization#LanguageManualJoinOptimization-SMBJoinacrossTab

How to disable SMB join?

2016-05-31 Thread Banias H
Hi, Does anybody know if there is a config setting to disable SMB join? One of our Hive queries failed with ArrayIndexOutOfBoundsException when Tez is the execution engine. The error seems to be addressed by https://issues.apache.org/jira/browse/HIVE-13282 We have Hive 1.2 and Tez 0.7 in our cluste

Re: Why does the user need write permission on the location of external hive table?

2016-05-31 Thread Mich Talebzadeh
Right, that directory belongs to hdfs:hdfs and no one else bar that user can write to it. If you are connecting via beeline you need to specify the user and password: beeline -u jdbc:hive2://rhes564:10010/default org.apache.hive.jdbc.HiveDriver -n hduser -p When I look at permissioning I see o

Re: Why does the user need write permission on the location of external hive table?

2016-05-31 Thread Sandeep Giri
Yes, when I run hadoop fs it gives results correctly. *hadoop fs -ls /data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/* *Found 30 items* *-rw-r--r-- 3 hdfs hdfs 6148 2015-12-04 15:19 /data/SentimentFiles/SentimentFiles/upload/data/tweets_raw/.DS_Store* *-rw-r--r-- 3 hdfs hdfs

Re: Why does the user need write permission on the location of external hive table?

2016-05-31 Thread Mich Talebzadeh
Is this location correct and valid? LOCATION '/data/SentimentFiles/*SentimentFiles*/upload/data/tweets_raw/'

Why does the user need write permission on the location of external hive table?

2016-05-31 Thread Sandeep Giri
Hi Hive Team, As per my understanding, in Hive, you can create two kinds of tables: Managed and External. In the case of a managed table, you own the data and hence when you drop the table the data is deleted. In the case of an external table, you don't have ownership of the data and hence when you delete su

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-31 Thread Jörn Franke
Thanks, very interesting explanation. Looking forward to testing it. > On 31 May 2016, at 07:51, Gopal Vijayaraghavan wrote: > > >> That being said all systems are evolving. Hive supports tez+llap which >> is basically the in-memory support. > > There is a big difference between where LLAP & Spark