Which distribution are you using? Do you have hadoop-aws on the classpath? Is ‘/path/to/hadoop-install’ a literal value or a placeholder you’re using for the actual location?
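
One thing to check: at least in the Hive versions I’ve used, hive.aux.jars.path wants a comma-separated list of jar files rather than a bare directory, so pointing it at tools/lib may not actually put anything on the classpath. Something like this might work better (just a sketch against the stock Hadoop 2.9.0 layout; check the exact aws-java-sdk-bundle version that ships in your tools/lib):

  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/hadoop-aws-2.9.0.jar,file:///path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/aws-java-sdk-bundle-1.11.199.jar</value>
  </property>

For a quick test you can do the same from the hive client before running your CREATE TABLE:

  ADD JAR /path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/hadoop-aws-2.9.0.jar;
  ADD JAR /path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/aws-java-sdk-bundle-1.11.199.jar;

If the S3AFileSystem class can’t be loaded, that would also explain your second error: when Hive can’t instantiate a filesystem for the s3a scheme it reports the location as an unreachable namenode rather than a missing class.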
Cheers,

Elliot.

On Sat, 9 Dec 2017 at 00:08, Scott Halgrim <scott.halg...@zapier.com> wrote:

> Hi,
>
> I’ve been struggling with this for a few hours, hopefully somebody here
> can help me out.
>
> We have a lot of data in parquet format on S3 and we want to use Hive to
> query it. I’m running on Ubuntu and we have a MySQL metadata store on AWS
> RDS.
>
> The command in the hive client I’m trying to run is:
>
> CREATE EXTERNAL TABLE
>   my_schema.my_table
>   (account_id INT,
>   action VARCHAR(282),
>   another_id INT,
>   yaid INT,
>   `date` TIMESTAMP,
>   deleted_at TIMESTAMP,
>   id INT,
>   lastchanged TIMESTAMP,
>   thing_index DOUBLE,
>   old_id INT,
>   parent_id INT,
>   running INT,
>   other_id INT,
>   another_thing VARCHAR(282),
>   title VARCHAR(282),
>   type_of VARCHAR(282))
> PARTITIONED BY (snapshot_date DATE)
> STORED AS parquet
> LOCATION 's3a://bucket/folder/foo_my_schema.my_table';
>
> The error I get is:
>
> FAILED: SemanticException java.lang.RuntimeException:
> java.lang.ClassNotFoundException: Class
> org.apache.hadoop.fs.s3a.S3AFileSystem not found
>
> I have this in my hive-site.xml file:
>
> <property>
>   <name>hive.aux.jars.path</name>
>   <value>/path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib</value>
> </property>
>
> Another thing I tried was to create the external table without a location
> and then alter it to have the location:
>
> alter table my_schema.my_table set location
> "s3a://bucket/folder/foo_my_schema.my_table";
>
> And then I get a different error:
>
> FAILED: SemanticException Cannot connect to namenode, please check if
> host/port pair for s3a://bucket/folder/foo_my_schema.my_table is valid
>
> What could I be missing?
>
> Thanks!
>
> Scott