Right, your CREATE TABLE statement now points to your S3 location, so you
don't need to do anything else. However, queries will pull this data from
S3 every time, which will be a little slower, and you'll incur a small
charge for reading from S3. Still, parking data there is great when you
only ne
Chunky,
You have an external table that points at the location s3://location/
No need to load the data. All files (or partitions folders) under
s3://location/ should be available via the table.
Just run your queries on it.
Load data will move the data from one HDFS location to another. You don't
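To illustrate the point above, a minimal sketch (the table, column, and partition names here are hypothetical, not from this thread):

```sql
-- External table over existing S3 data: no LOAD DATA needed.
CREATE EXTERNAL TABLE logs (msg STRING)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://my-bucket/logs/';

-- Register an existing partition directory, then query it directly.
ALTER TABLE logs ADD PARTITION (dt='2012-11-05')
LOCATION 's3://my-bucket/logs/dt=2012-11-05/';

SELECT COUNT(*) FROM logs WHERE dt = '2012-11-05';
```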
Hi,
Now when I am trying to load a CSV file into any table I created, it's
not working.
I created a table :-
CREATE EXTERNAL TABLE someidtable (
someid STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
LOCATION 's3://location/';
Then
LOAD DATA INPATH 's3://location/
Okay Mark, I will be looking into this JIRA regularly.
Thanks again for helping.
Chunky.
On Wed, Nov 7, 2012 at 12:22 PM, Mark Grover wrote:
> Chunky,
> I just tried it myself. It turns out that the directory you are adding as
> partition has to be empty for msck repair to work. This is obviously
Chunky,
I just tried it myself. It turns out that the directory you are adding as
partition has to be empty for msck repair to work. This is obviously
sub-optimal and there is a JIRA in place (
https://issues.apache.org/jira/browse/HIVE-3231) to fix it.
So, I'd suggest you keep an eye out for the
Hi Mark,
Sorry, I forgot to mention. I have also tried
msck repair table ;
and I got the same output that I got from plain msck.
Do I need to change any other settings for this to work? I set up Hadoop
and Hive from scratch on EC2.
Thanks,
Chunky.
On Wed, Nov 7, 2012
Chunky,
You should have run:
msck repair table ;
Sorry, I should have made it clear in my last reply. I have added an entry
to Hive wiki for benefit of others:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Recoverpartitions
Mark
On Tue, Nov 6, 2012 at 9:5
Hi Mark,
I didn't get any error.
I ran this on hive console:-
"msck table Table_Name;"
It said OK and showed an execution time of 1.050 sec.
But when I checked partitions for table using
"show partitions Table_Name;"
It didn't show me any partitions.
Thanks,
Chunky.
On Tue, No
Glad to hear, Chunky.
Out of curiosity, what errors did you get when using msck?
On Tue, Nov 6, 2012 at 5:14 AM, Chunky Gupta wrote:
> Hi Mark,
> I tried msck, but it is not working for me. I have written a python script
> to partition the data individually.
>
> Thank you Edward, Mark and Dean.
Hi Mark,
I tried msck, but it is not working for me. I have written a python script
to partition the data individually.
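A script along those lines might look like this (a minimal sketch; the table name, bucket, and dt directory layout are assumptions, not details from this thread — in practice each statement would be passed to the Hive CLI, e.g. via `hive -e`):

```python
from datetime import date, timedelta

def add_partition_statements(table, bucket, start, days):
    """Generate one ALTER TABLE ... ADD PARTITION statement per day.

    Builds the HiveQL strings only; running them against Hive is a
    separate step (e.g. `hive -e "<statement>"`).
    """
    stmts = []
    for i in range(days):
        dt = (start + timedelta(days=i)).isoformat()  # e.g. '2012-11-01'
        stmts.append(
            "ALTER TABLE {t} ADD IF NOT EXISTS PARTITION (dt='{d}') "
            "LOCATION 's3://{b}/data/dt={d}/';".format(t=table, b=bucket, d=dt)
        )
    return stmts

statements = add_partition_statements("my_logs", "my-bucket", date(2012, 11, 1), 3)
for s in statements:
    print(s)
```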
Thank you Edward, Mark and Dean.
Chunky.
On Mon, Nov 5, 2012 at 11:08 PM, Mark Grover wrote:
> Chunky,
> I have used "recover partitions" command on EMR, and that worked fine.
Chunky,
I have used "recover partitions" command on EMR, and that worked fine.
However, take a look at https://issues.apache.org/jira/browse/HIVE-874. Seems
like the msck command in Apache Hive does the same thing. Try it out and
let us know how it goes.
Mark
On Mon, Nov 5, 2012 at 7:56 AM, Edward Capri
Recover partitions should work the same way for different file systems.
Edward
On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler
wrote:
> Writing a script to add the external partitions individually is the only way
> I know of.
>
> Sent from my rotary phone.
>
>
> On Nov 5, 2012, at 8:19 AM, Chunky G
Writing a script to add the external partitions individually is the only way I
know of.
Sent from my rotary phone.
On Nov 5, 2012, at 8:19 AM, Chunky Gupta wrote:
> Hi Dean,
>
> Actually I was having Hadoop and Hive cluster on EMR and I have S3 storage
> containing logs which updates dail
Hi Dean,
Actually I had a Hadoop and Hive cluster on EMR, with S3 storage
containing logs that update daily, partitioned by date (dt). And
I was using this recover partitions command.
Now I want to shift to EC2 and run my own Hadoop and Hive cluster. So,
what is the alternate of usi
The RECOVER PARTITIONS is an enhancement added by Amazon to their version
of Hive.
http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
Chapter 21 of Programming Hive discusses this feature and other aspects
of using Hive in EMR.
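For reference, the EMR-specific command and what this thread identifies as its Apache Hive counterpart look like this (the table name is illustrative):

```sql
-- Amazon EMR Hive only:
ALTER TABLE my_logs RECOVER PARTITIONS;

-- Apache Hive (see HIVE-874, mentioned elsewhere in this thread):
MSCK REPAIR TABLE my_logs;
```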
dean
On
Hi,
I am having a cluster setup on EC2 with Hadoop version 0.20.2 and Hive
version 0.8.1 (I configured everything) . I have created a table using :-
CREATE EXTERNAL TABLE XXX ( YYY )
PARTITIONED BY ( ZZZ )
ROW FORMAT DELIMITED FIELDS TERMINATED BY 'WWW'
LOCATION 's3://my-location/data/';
Now I am