date:20160628

Re: Tez jobs on YARN failing sporadically..

2016-06-28 Thread saquib khan

Unsubscribe On Tuesday, June 28, 2016, Gautam wrote: > Hello, > > We have Tez being used for one of our main ETL workflows and have been > using it for couple months now. We recently started seeing the following > error for a query that regularly runs and hasn't been changed in any way. > It's a

Re: Tez jobs on YARN failing sporadically..

2016-06-28 Thread Gautam

*Software Versions* - Hive : 1.1.0 - Tez : 0.7.1 - Hadoop : 2.6.0 On Tue, Jun 28, 2016 at 5:58 PM, Gautam wrote: > Hello, > > We have Tez being used for one of our main ETL workflows and have been > using it for couple months now. We recently started seeing the following > error for a query tha

Tez jobs on YARN failing sporadically..

2016-06-28 Thread Gautam

Hello, We have Tez being used for one of our main ETL workflows and have been using it for couple months now. We recently started seeing the following error for a query that regularly runs and hasn't been changed in any way. It's a job that counts an hour's worth of data in a M-R-R flow. This erro

Re: Query Performance Issue : Group By and Distinct and load on reducer

2016-06-28 Thread @Sanjiv Singh

thanks a lot. let me give it a try. Regards Sanjiv Singh Mob : +091 9990-447-339 On Tue, Jun 28, 2016 at 5:32 PM, Markovitz, Dudu wrote: > There’s a distributed algorithm for windows function that is based on the > ORDER BY clause rather than the PARTITION BY clause. > > I doubt if is implemen

RE: Query Performance Issue : Group By and Distinct and load on reducer

2016-06-28 Thread Markovitz, Dudu

There’s a distributed algorithm for window functions that is based on the ORDER BY clause rather than the PARTITION BY clause. I doubt it is implemented in Hive, but it’s worth a shot. select * ,row_number () over (order by rand()) as ETL_ROW_ID from INTER_ETL ; For unique

Re: Query Performance Issue : Group By and Distinct and load on reducer

2016-06-28 Thread @Sanjiv Singh

ETL_ROW_ID is meant to be a consecutive number. I need to check whether having just a unique number would break any logic. Considering a unique number for the ETL_ROW_ID column, what are the optimum options available? What if it has to be a consecutive number only? Regards Sanjiv Singh Mob : +091 9990-447-339 On Tue, Ju

RE: Query Performance Issue : Group By and Distinct and load on reducer

2016-06-28 Thread Markovitz, Dudu

I’m guessing ETL_ROW_ID should be unique but not necessarily contain only consecutive numbers? From: @Sanjiv Singh [mailto:sanjiv.is...@gmail.com] Sent: Tuesday, June 28, 2016 10:57 PM To: Markovitz, Dudu Cc: user@hive.apache.org Subject: Re: Query Performance Issue : Group By and Distinct and l

Re: Query Performance Issue : Group By and Distinct and load on reducer

2016-06-28 Thread @Sanjiv Singh

Hi Dudu, You are correct ...ROW_NUMBER() is main culprit. ROW_NUMBER() OVER Not Fast Enough With Large Result Set, any good solution? Regards Sanjiv Singh Mob : +091 9990-447-339 On Tue, Jun 28, 2016 at 3:42 PM, Markovitz, Dudu wrote: > The row_number operation seems to be skewed. > > > >

Hive Query Error: Cannot obtain block length

2016-06-28 Thread Arun Patel

I am trying to do log analytics on the logs created by Flume. Hive queries are failing with below error. "hadoop fs -cat" command works on all these open files. Is there a way to read these open files? My requirement is to read the data from open files too. I am using tez as execution engine.

RE: Query Performance Issue : Group By and Distinct and load on reducer

2016-06-28 Thread Markovitz, Dudu

The row_number operation seems to be skewed. Dudu From: @Sanjiv Singh [mailto:sanjiv.is...@gmail.com] Sent: Tuesday, June 28, 2016 8:54 PM To: user@hive.apache.org Subject: Query Performance Issue : Group By and Distinct and load on reducer Hi All, I am having performance issue with data skew o

RE: Hive error : Can not convert struct<> to

2016-06-28 Thread Markovitz, Dudu

The staging table has no partitions, so no issue there. Also, the error specifically refers to the conversion between the struct types. Dudu FAILED: SemanticException [Error 10044]: Line 2:23 Cannot insert into target table because column number/types are different ''CA'': Cannot convert c

Query Performance Issue : Group By and Distinct and load on reducer

2016-06-28 Thread @Sanjiv Singh

Hi All, I am having performance issue with data skew of the distinct statement in Hive . See below query with DISTINCT operator. *Original Query : * SELECT DISTINCT SD.

Re: What is the best way to store IPv6 address in Hive?

2016-06-28 Thread Devopam Mittra

My best bet will be string data type itself with partitioning to aid partial search. Please do consider the fact that ipv6 address is more complicated than ipv4 in terms of searching . Regards Dev On 28 Jun 2016 9:35 pm, "Igor Kuzmenko" wrote: > Currently I'm using ORC transactional tables, and

What is the best way to store IPv6 address in Hive?

2016-06-28 Thread Igor Kuzmenko

Currently I'm using ORC transactional tables, and I need to store a lot of data containing IP addresses. With IPv4 it can be an Integer (exactly 4 bytes), but what about IPv6? Obviously it should be space efficient and easy to search for an exact match. As an extra feature it would be good to do fast search
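
One way to picture the space-efficient option asked about here is to pack each address into its fixed 16-byte form before loading it. The sketch below is only an illustration and is not taken from the thread: the helper names `ipv6_to_bytes` and `prefix_range` are hypothetical, and storing the result in a binary column is an assumption rather than something the posters confirmed. Because every value has the same length, plain byte-wise comparison keeps address order, so an exact match is an equality test and a prefix search becomes a range test.

```python
import ipaddress
from typing import Tuple


def ipv6_to_bytes(text: str) -> bytes:
    """Return the canonical 16-byte (big-endian) form of an IPv6 address."""
    return ipaddress.IPv6Address(text).packed


def prefix_range(cidr: str) -> Tuple[bytes, bytes]:
    """Return the first and last packed addresses covered by a CIDR prefix."""
    net = ipaddress.IPv6Network(cidr, strict=False)
    return net.network_address.packed, net.broadcast_address.packed


if __name__ == "__main__":
    addr = ipv6_to_bytes("2001:db8::1")
    low, high = prefix_range("2001:db8::/32")
    # Fixed-length byte strings compare in address order.
    print(len(addr), low <= addr <= high)
```

Different spellings of the same address (upper case, uncompressed zeros) pack to identical bytes, which avoids the normalization step a string column would need for exact matching.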

Re: External_Tables_Disadvantages

2016-06-28 Thread Ajay Chander

Hi Team, Any insights on this one? Thank you On Monday, June 27, 2016, Ajay Chander wrote: > Hi Everyone, > > I would like to know the disadvantages of using External tables in Hive. I > was told that "Managing security with sentry will be very limited for > external tables" is it true? Can some

Re: Hive error : Can not convert struct<> to

2016-06-28 Thread Gopal Vijayaraghavan

> PARTITION(state='CA') > SELECT * WHERE se.adr.st='CA' > FAILED: SemanticException [Error 10044]: Line 2:23 Cannot insert into >target table because column number/types are different ''CA'': The error is bogus, but the issue has to do with the "SELECT *". Inserts where a partition is specified

RE: Hive error : Can not convert struct<> to

2016-06-28 Thread Markovitz, Dudu

Hi The fields' names are part of the struct definition. Different names, different types of structs. Dudu e.g. Setup create table t1 (s struct); create table t2 (s struct); insert into table t1 select named_struct('c1',1,'c2',2);

WebHCat Hive POST callback not being called

2016-06-28 Thread Pau Tallada

Hi, I'm trying to use the WebHCat REST API to post a long query to Hive, and have an endpoint in my app called upon completion. The POST request is like this: POST /templeton/v1/hive HTTP/1.1 Host: data.astro:50111 Cache-Control: no-cache Postman-Token: 7531c7e9-f0e6-ce58-4482-4f5a0cc98b52 Conte

Hive error : Can not convert struct<> to

2016-06-28 Thread Kuldeep Chitrakar

Hi I have staged table as hive (revise)> desc employees_se; OK name string salary float subordinates array deductions map adr struct I am trying to insert the data in partitioned table employees as hive (revise)> desc e

19 matches
