Hello,

First of all, Hadoop needs to use S3 as its primary file system. In Hadoop's
core-site.xml configuration you need to set fs.default.name to a value of
the following form: s3n://your-bucket-name
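
For example (a minimal sketch; the bucket name is a placeholder, and unless
your AWS credentials are supplied some other way you will also need the s3n
access-key properties):

  <property>
    <name>fs.default.name</name>
    <value>s3n://your-bucket-name</value>
  </property>
  <!-- assumption: credentials are not configured elsewhere -->
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_SECRET_KEY</value>
  </property>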

After this, the way I've done it in Hive 0.6 (and I assume it still
works):

ALTER TABLE my_table ADD PARTITION (p1='a', p2='b')
LOCATION 's3n://your-bucket-name/path-to-folder-for-partition';

This worked for me without any issues. I assume the other way you provided
should work as well, but there is probably an issue with the evaluation of
the query...
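
If you specifically need the LOAD DATA behaviour (actually copying the data
into the table), one workaround is to map the S3 folder with an external
staging table and copy it over with an INSERT. A sketch, reusing the names
from your example and assuming a matching schema:

CREATE EXTERNAL TABLE my_hive_table_staging (col1 STRING, col2 INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3n://mybucket/my_hive_table';

INSERT OVERWRITE TABLE my_hive_table PARTITION (dt='2011-11-11')
SELECT * FROM my_hive_table_staging;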

Florin



On 11 November 2011 22:11, jiang licht <licht_ji...@yahoo.com> wrote:

> Check if this link provides any help:
>
> http://aws.amazon.com/elasticmapreduce/faqs/#hive-2
>
> read " Are there new features in Hive specific to Amazon Elastic
> MapReduce?"
>
> and
>
> http://aws.amazon.com/articles/2856
>
> Best regards,
> Michael
> ------------------------------
> *From:* Raimon Bosch <raimon.bo...@gmail.com>
> *To:* user@hive.apache.org
> *Sent:* Friday, November 11, 2011 10:40 AM
> *Subject:* load data from s3 to hive
>
>
> Hi,
>
> I have read that Hadoop supports native operations over the S3 filesystem,
> so you're able to perform operations like:
>
> hadoop fs -ls s3n://mybucket/my_folder/
>
> or:
>
> hadoop fs -cp s3n://mybucket/my_folder /tmp/my_folder
>
> I'm wondering why Hive is not able to perform similar operations. It would
> be a very good feature to load data directly from S3 into Hive, something like:
>
> hive -e "LOAD DATA LOCAL INPATH 's3n://mybucket/my_hive_table' INTO
> TABLE my_hive_table PARTITION(dt='2011-11-11');"
>
> Right now this is not possible. What do you think? Which classes should be
> changed?


