Thanks Richin and Pedro,
So, a final clarification:
> Another way of doing this, apart from dynamic partitioning, is to create
> your directories like below, either manually or through the ETL process
> you might be using to produce the table data; then it is pretty easy.
> s3://ravi/logs/adv_id=123/date=2
Hi,
I have set up a Hadoop cluster on Amazon EC2 with my data stored on S3. I would
like to use Hive to process the data on S3.
I created an external table in hive using the following:
CREATE EXTERNAL TABLE mytable1
(
HIT_TIME_GMT string,
SERVICE string
) ROW FORMAT
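A minimal sketch of how such an external table over S3 is commonly declared; the
field delimiter and bucket path below are assumptions, not taken from the
original post:

-- Sketch only: the delimiter and S3 path are assumed placeholders.
-- Depending on the Hadoop S3 filesystem in use, the scheme may be s3:// or s3n://.
CREATE EXTERNAL TABLE mytable1 (
  HIT_TIME_GMT STRING,
  SERVICE STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION 's3://my-bucket/path/to/data/';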
Directly accessing the metastore schema is generally not a good idea.
Instead I recommend using the ALTER TABLE SET LOCATION command:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable%2FPartitionLocation
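For example (the table name and new URI below are placeholders):

-- Placeholders: repoint the table, or a single partition, at its new URI.
ALTER TABLE mytable SET LOCATION 'hdfs://new-namenode:9000/user/hive/warehouse/mytable';
ALTER TABLE mytable PARTITION (dt='2012-08-24')
SET LOCATION 'hdfs://new-namenode:9000/user/hive/warehouse/mytable/dt=2012-08-24';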
Thanks.
Carl
On Fri, Aug 24, 2012 at 10:56 A
My two cents,
Try checking if there is skew in the input to that reducer compared to the
other reducers. This happens sometimes in joins, where some reducers get a
large amount of input data and keep running forever.
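For example, a quick look at the join-key distribution (table and column names
here are hypothetical):

-- Hypothetical names: count rows per join key to spot heavy keys
-- that would all be routed to a single reducer.
SELECT join_key, COUNT(*) AS cnt
FROM my_table
GROUP BY join_key
ORDER BY cnt DESC
LIMIT 20;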
On Fri, Aug 24, 2012 at 11:41 PM, Bertrand Dechoux wrote:
> It is not clear from y
It is not clear from your post: is your job always failing during the
same step? Or only sometimes? Or only once?
Since it's a Hive query, I would modify it to find the root cause.
First create temporary "files" which are the results of the first three
M/R jobs.
Then run the fourth M/R on it and tr
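A sketch of that staged approach, with entirely hypothetical table and column
names since the original query is not shown in the thread:

-- Hypothetical: materialize the result of the earlier stages...
CREATE TABLE tmp_first_stages AS
SELECT a.id, a.val, b.extra
FROM table_a a
LEFT OUTER JOIN table_b b ON (a.id = b.id);

-- ...then run the suspect stage against it in isolation to see where it stalls.
SELECT t.id, t.val, c.more
FROM tmp_first_stages t
LEFT OUTER JOIN table_c c ON (t.id = c.id);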
Bejoy,
Thank you for your help.
Updated the metastore and it's working fine.
Regards
-Alok
On Fri, Aug 24, 2012 at 5:40 PM, Bejoy KS wrote:
> Yes you need to update the metastore db directly for this to be in effect.
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
>
> -Original
Why don't you try splitting the big query into smaller ones?
On Fri, Aug 24, 2012 at 10:20 AM, Tim Havens wrote:
>
> Just curious if you've tried using Hive's explain method to see what IT
> thinks of your query.
>
>
> On Fri, Aug 24, 2012 at 9:36 AM, Himanish Kushary wrote:
>
>> Hi,
>>
>> We h
Just curious if you've tried using Hive's explain method to see what IT
thinks of your query.
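For example, with a hypothetical query (prefix your own query the same way):

-- EXPLAIN (or EXPLAIN EXTENDED) prints the plan and the M/R stages Hive will run.
EXPLAIN EXTENDED
SELECT a.id, COUNT(*)
FROM table_a a
LEFT OUTER JOIN table_b b ON (a.id = b.id)
GROUP BY a.id;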
On Fri, Aug 24, 2012 at 9:36 AM, Himanish Kushary wrote:
> Hi,
>
> We have a complex query that involves several left outer joins resulting
> in 8 M/R jobs in Hive. During execution of one of the stages
Hi,
We have a complex query that involves several left outer joins, resulting in
8 M/R jobs in Hive. During execution of one of the stages (after three M/R
jobs have run), the job fails because a few reduce tasks fail due to
inactivity.
Most of the reduce tasks go through fine (within 3 mins) but th
Hi Ravi,
Another way of doing this, apart from dynamic partitioning, is to create your
directories like below, either manually or through the ETL process you might
be using to produce the table data; then it is pretty easy.
s3://ravi/logs/adv_id=123/date=2012-01-01/log.gz
s3://ravi/logs/adv_id=456/date=2012-01-02/
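With that layout each directory can be attached as a partition; a sketch,
assuming the partitioned results table discussed elsewhere in this thread:

-- Register one directory per partition (paths follow the examples above).
ALTER TABLE results ADD PARTITION (adv_id='123', `date`='2012-01-01')
LOCATION 's3://ravi/logs/adv_id=123/date=2012-01-01/';
-- Or let Hive scan the layout and register everything it finds:
-- MSCK REPAIR TABLE results;  (on Amazon EMR: ALTER TABLE results RECOVER PARTITIONS;)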
Hi,
On 24 Aug 2012, at 14:08, Ravi Shetye wrote:
>
> Is this all I need to do to load the data?
> how will the system know what data will go into what partition?
> As I understand it, the partition columns should be pseudo-columns and not part of
> the actual data.
Sorry, I just copy&pasted your ta
Thanks for the reply.
Let's concentrate on the second case:
CREATE EXTERNAL TABLE results (cookie STRING,
d2 STRING,
url STRING,
d4 STRING,
d5 STRING,
d6 STRING,
adv_id_dummy STRING,
timestp STRING,
ip STRING,
userAgent STRING,
stage STRING,
d12 STRING,
d13 STRING)
PAR
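The statement is cut off above; a minimal self-contained sketch of the
partitioned pattern, assuming adv_id/date partition columns matching the
directory layout discussed in this thread and a tab-delimited format (only the
bucket root comes from the thread):

-- Sketch, not the original DDL: partition columns are declared separately
-- from the data columns and map onto the adv_id=/date= directories.
CREATE EXTERNAL TABLE results_sketch (
  cookie STRING,
  url STRING
)
PARTITIONED BY (adv_id STRING, `date` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://ravi/logs/';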
Hi,
On 24 Aug 2012, at 13:26, Ravi Shetye wrote:
> I have the data in s3 bucket in the following manner
>
> s3://logs/ad1date1.log.gz
> s3://logs/ad1date2.log.gz
> s3://logs/ad1date3.log.gz
> s3://logs/ad1date4.log.gz
> s3://logs/ad2date1.log.gz
> s3://logs/ad2date2.log.gz
> s3://logs/ad2date3.l
I have the data in s3 bucket in the following manner
s3://logs/ad1date1.log.gz
s3://logs/ad1date2.log.gz
s3://logs/ad1date3.log.gz
s3://logs/ad1date4.log.gz
s3://logs/ad2date1.log.gz
s3://logs/ad2date2.log.gz
s3://logs/ad2date3.log.gz
s3://logs/ad2date4.log.gz
I have to load some of them into
Yes you need to update the metastore db directly for this to be in effect.
Regards
Bejoy KS
Sent from handheld, please excuse typos.
-----Original Message-----
From: Alok Kumar
Date: Fri, 24 Aug 2012 13:30:36
To: ;
Reply-To: user@hive.apache.org
Subject: alter external table location with new
Hello,
We have a Hive external table mapped to HBase, and we are now moving
from a pseudo-distributed to a fully distributed Hadoop cluster.
We found that Hive queries are still pointing to the old namenode address,
i.e. hdfs://localhost:9000/user/hive/warehouse/, as Hive stores the
full URI in its Derby metastore.
Q. What w