Which distribution are you using? Do you have hadoop-aws on the class path?
Is ‘/path/to/hadoop-install’ a literal value or a placeholder that you’re
using for the actual location?
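
One thing worth checking: if I remember right, hive.aux.jars.path expects a
comma-separated list of jar files rather than a bare directory, so pointing
it at .../share/hadoop/tools/lib may silently add nothing. As a quick sanity
check you could add the two relevant jars by hand in the client (file names
below are illustrative; use whichever versions actually sit in that
directory):

hive> ADD JAR /path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/hadoop-aws-2.9.0.jar;
hive> ADD JAR /path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/aws-java-sdk-bundle-<version>.jar;

or list them explicitly before starting the client:

export HIVE_AUX_JARS_PATH=/path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/hadoop-aws-2.9.0.jar,/path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/aws-java-sdk-bundle-<version>.jar

If that makes the ClassNotFoundException go away, wiring the same jar list
into hive-site.xml (or HiveServer2’s environment) is the permanent fix.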

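On your second error: "Cannot connect to namenode" is quite often the same
problem wearing a different hat. When the S3AFileSystem class can’t be
loaded, Hive falls back to treating the location as an unknown host/port
pair. Once the jars are picked up you will also need credentials for the
bucket, unless you are relying on IAM instance roles. These are the standard
s3a property names; the values are obviously placeholders:

<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
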
Cheers,

Elliot.

On Sat, 9 Dec 2017 at 00:08, Scott Halgrim <scott.halg...@zapier.com> wrote:

> Hi,
>
> I’ve been struggling with this for a few hours, hopefully somebody here
> can help me out.
>
> We have a lot of data in parquet format on S3 and we want to use Hive to
> query it. I’m running on Ubuntu and we have a MySQL metadata store on AWS
> RDS.
>
> The command in the hive client I’m trying to run is:
>
> CREATE EXTERNAL TABLE
> my_schema.my_table
> (account_id INT,
> action VARCHAR(282),
> another_id INT,
> yaid INT,
> `date` TIMESTAMP,
> deleted_at TIMESTAMP,
> id INT,
> lastchanged TIMESTAMP,
> thing_index DOUBLE,
> old_id INT,
> parent_id INT,
> running INT,
> other_id INT,
> another_thing VARCHAR(282),
> title VARCHAR(282),
> type_of VARCHAR(282))
> PARTITIONED BY (snapshot_date DATE)
> STORED AS parquet
> LOCATION 's3a://bucket/folder/foo_my_schema.my_table';
>
>
> The error I get is:
>
> FAILED: SemanticException java.lang.RuntimeException:
> java.lang.ClassNotFoundException: Class
> org.apache.hadoop.fs.s3a.S3AFileSystem not found
>
>
> I have this in my hive-site.xml file:
>
> <property>
> <name>hive.aux.jars.path</name>
> <value>/path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib</value>
> </property>
>
>
>
> Another thing I tried was to create the external table without a location
> and then alter it to have the location:
>
> alter table my_schema.my_table set location
> "s3a://bucket/folder/foo_my_schema.my_table";
>
>
> And then I get a different error:
>
> FAILED: SemanticException Cannot connect to namenode, please check if
> host/port pair for s3a://bucket/folder/foo_my_schema.my_table is valid
>
>
> What could I be missing?
>
> Thanks!
>
> Scott
>
