Thanks a lot, Sanjay.
Any more thoughts on this?
I'm OK with any alternative to the merge concept in Hive.
On Thu, Mar 6, 2014 at 2:02 AM, Subramanian, Sanjay (HQP) <
sanjay.subraman...@roberthalf.com> wrote:
> Hey Raj
>
> Maybe I am misunderstanding the question but u don't really have to do
Hello Hive Experts
Is there any data modeling tool that you can suggest that can work with
Hive and Postgres?
*Objective* : build & maintain entity definitions for Hive, Postgres thru
this one tool...Build logical and physical models for data warehouse in the
same tool.
Any pointers?
*thanks,
Hey Raj
Maybe I am misunderstanding the question, but you don't really have to do anything
fancy to merge
ONE TIME
CREATE EXTERNAL TABLE employee (
empno BIGINT,
ename STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
ALTER TABLE employee SET LOCATION 'hdfs://path/
First off, I want those cluster specs :-)
I am going to propose a few theories without proof here
* From the info you have provided, there are no partitions. That might be
what is causing this problem
* Is it possible to partition the data based on say days (to start with)
and run this L
Sorry, my mistake. I didn't notice that you are using a cross join.
Yes, a cross join will always use one reducer, at least that is my understanding.
Yong
Date: Wed, 5 Mar 2014 15:27:48 +0100
Subject: Re: Best way to avoid cross join
From: darkwoll...@gmail.com
To: user@hive.apache.org
Hey Yong,
Setting the number of reducers will not normally help unless there are that
many keys for the reducers. Even if it launches that many reducers, most of
them may simply not get any data.
Can you share how many different ids there are and what the data sizes are in
rows?
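To illustrate the point about keys and reducers, here is a minimal Python sketch. It is not Hive's actual partitioner; it only mimics the general idea that each row goes to the reducer given by a hash of its shuffle key modulo the reducer count. The function names and the CRC32 hash are assumptions chosen for illustration.

```python
import zlib
from collections import Counter


def rows_per_reducer(keys, num_reducers):
    """Count how many rows each reducer would receive.

    Assumption: rows are routed by hash(key) % num_reducers. CRC32 is
    used here only because it is deterministic; it is not what Hive uses.
    """
    counts = Counter()
    for key in keys:
        reducer = zlib.crc32(str(key).encode("utf-8")) % num_reducers
        counts[reducer] += 1
    return counts


def busy_reducers(keys, num_reducers):
    """Number of reducers that receive at least one row."""
    return len(rows_per_reducer(keys, num_reducers))


if __name__ == "__main__":
    # A join with no join key behaves as if every row had the same key.
    print(busy_reducers([""] * 1000, 50))
    # Three distinct ids can keep at most three reducers busy.
    print(busy_reducers([1, 2, 3], 50))
    # Many distinct ids spread across the reducers.
    print(busy_reducers(range(10000), 50))
```

Under this model, asking for 50 reducers changes nothing when all rows share one key: 49 of them would simply sit idle.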
On Wed, Mar 5, 2
hey Yong,
Even without the group by (pure cross join) the query is only using one
reducer. Even specifying more reducers doesn't help:
set mapred.reduce.tasks=50;
SELECT id1,
m.keyword,
prep_kw.keyword
FROM (select id1, keyword from import1) m
CROSS JOIN
(SELECT keyword FROM et_ke
Hi, Wolli:
Cross join doesn't mean Hive has to use one reducer.
From a query point of view, the following cases will use one reducer:
1) ORDER BY in your query (instead of using SORT BY)
2) Only one reducer group, which means all the data has to be sent to one
reducer, as there is only one
reducer gro
Hey everyone,
Before I write a lot of text, I'll just post something that is already
written:
http://www.sqlservercentral.com/Forums/Topic1328496-360-1.aspx
The first post addresses a problem pretty similar to the one I have. Currently my
implementation looks like this:
SELECT id1,
MAX(
CASE
WHE
In a single-node installation of Hadoop 2.2, I am trying to run the Cloudera
example "Accessing Table Data with MapReduce" that copies data from one
table to another:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_19_6.html
Example code c
Hi,
Help required to merge data in Hive.
Ex:
Today file
-
Empno ename
1 abc
2 def
3 ghi
Tomorrow file
-
Empno ename
5 abcd
6 defg
7 ghij
Reg: should not drop the hive
I have to create external Hive tables (and then add partitions on these
tables) on the existing data files in HDFS.
On doing this I get the error:
FAILED: Error in metadata:
MetaException(message:java.lang.IllegalStateException: Can't overwrite
cause)
FAILED: Execution Error, return code 1 fr