Query performance correlated to increase in delta files?

2015-11-19 Thread Sai Gopalakrishnan
Hello fellow developer, Greetings! I am using Hive for querying transactional data. I transfer data from RDBMS to Hive using Sqoop and prefer the ORC format for speed and its ACID properties. I found out that Sqoop has no support for reflecting the updated and deleted records in RDBMS and henc

Re: How to capture query log and duration

2015-11-19 Thread Gopal Vijayaraghavan
> We would like to capture some information in our Hadoop Cluster. > Can anybody please suggest how we can achieve this, any tools > available already ? Or do we need to scrub any log ? Apache Atlas is the standardized solution for deeper analytics into data ownership/usage (look at the HiveHoo

Re: Hive version with Spark

2015-11-19 Thread Jone Zhang
*-Phive is enough* *-Phive will use hive1.2.1 by default on Spark1.5.0+* 2015-11-19 4:50 GMT+08:00 Udit Mehta : > As per this link : > https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started, > you need to build Spark without Hive. > > On Wed, Nov 18, 2015 at 8:50 AM, Sof

Re: Building Spark to use for Hive on Spark

2015-11-19 Thread Jone Zhang
I should add that Spark1.5.0+ uses hive1.2.1 by default when you use -Phive. So this page should read as below: “Note that you must have a version of Spark which does *not* include the Hive jars if you use Spark1.

How to capture query log and duration

2015-11-19 Thread Rajit Saha
Hi, We would like to capture some information in our Hadoop Cluster. Can anybody please suggest how we can achieve this? Are any tools available already, or do we need to scrub any log? 1. We want to know how many queries are run every day 2. What are the durations of those queries.

Re: [VOTE] Hive 2.0 release plan

2015-11-19 Thread Sergey Shelukhin
Hmm. I looked at the JIRAs targeting the release and it looks like there’s a large number of features still pending. I am going to postpone creating the branch to next week. I am also going to unassign JIRAs from the release at that time. On 15/11/16, 18:09, "Sergey Shelukhin" wrote: >With 8 bindi

Re: hive failure after HDP 2.3 upgrade

2015-11-19 Thread Brian Jeltema
I did. Besides, the query is broken, not the schema. If I perform this query from the mysql command prompt, the escape has to be specified as ESCAPE '\\' > On Nov 19, 2015, at 6:46 PM, Artem Ervits wrote: > > Confirm Hive metastore schema update scripts have run. > > On Nov 19, 2015 11:39 AM,

Re: hive failure after HDP 2.3 upgrade

2015-11-19 Thread Artem Ervits
Confirm Hive metastore schema update scripts have run. On Nov 19, 2015 11:39 AM, "Brian Jeltema" wrote: > Following up, I turned on logging in the MySQL server to capture the > failing query. The query being logged by MySQL is > > SELECT `A0`.`NAME` AS NUCORDER0 FROM `DBS` `A0` WHERE > LOWER(`A

Re: troubleshooting: "unread block data' error

2015-11-19 Thread Xuefu Zhang
Are you able to run queries that are not touching HBase? This problem was seen before but has been fixed. On Tue, Nov 17, 2015 at 3:37 AM, Sofia wrote: > Hello, > > I have configured Hive to work Spark. > > I have been trying to run a query on a Hive table managing an HBase table > (created via HBaseSto

can not create a table

2015-11-19 Thread jim Zhou
Hi, can someone help me on this. I am doing this drop table if exists TestOutputCSV; Create table rawEventExtCSV row format delimited FIELDS TERMINATED BY ',' lines terminated by '\n' as select v.id, v.sid, v.timestamp from t3 jt LATERAL VIEW json_tuple(jt.value, 'id', 'sid', 'timestamp') v as id

Re: hive failure after HDP 2.3 upgrade

2015-11-19 Thread Brian Jeltema
Following up, I turned on logging in the MySQL server to capture the failing query. The query being logged by MySQL is SELECT `A0`.`NAME` AS NUCORDER0 FROM `DBS` `A0` WHERE LOWER(`A0`.`NAME`) LIKE '_%' ESCAPE '\' ORDER BY NUCORDER0 which I believe is failing because the backslash in the ESCAP

hive failure after HDP 2.3 upgrade

2015-11-19 Thread Brian Jeltema
Originally posted in the Ambari users group, but probably more appropriate here: I’ve done a rolling upgrade to HDP 2.3 and everything appears to be working now except for Hive. The HiveServer2 process is shown as ‘Started’, but it’s really broken, as is the Hive Metastore. HiveServer2 is not li

[ANN] Hivemall v0.4.0 is now available

2015-11-19 Thread Makoto Yui
Hello all, We released a newer version of Hivemall, v0.4.0. Hivemall provides machine learning functionality over Hive UDFs/UDAFs/UDTFs. Hivemall is easy to use because every machine learning step is done within HiveQL. https://github.com/myui/hivemall In the latest release (v0.4.0), we intr