Thanks, Elliot.

I’m using Ubuntu 14.04.5 LTS, Hadoop 2.9.0, and Hive 2.3.2.

I didn’t have a CLASSPATH environment variable set before, but I’ve just set it 
to include that jar, and I get the same error.
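Specifically, what I ran was roughly this (same obfuscated install path as in 
my hive-site.xml below; hadoop-aws-2.9.0.jar is the jar I mean, from the 2.9.0 
tools/lib dir):

   # put the hadoop-aws jar on the classpath (path obfuscated as elsewhere)
   export CLASSPATH=/path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib/hadoop-aws-2.9.0.jar:$CLASSPATH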

I don’t really have the installation in “/path/to/…”, but everything else in the 
path is correct. You’re right, I’m just obfuscating the actual value.

A new odd behavior that didn’t happen yesterday: in hive, ‘SELECT * 
FROM TBLS;’ returns the error ‘Table not found: TBLS’. Yet I can connect to 
the MySQL backend and query TBLS there just fine. In hive I can also query 
tables I know are there…just not TBLS.
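For example, this kind of thing works fine from the shell (host, user, and 
metastore database name are placeholders):

   # query the metastore's TBLS table directly in MySQL
   mysql -h <rds-host> -u <user> -p -e 'SELECT TBL_ID, TBL_NAME, TBL_TYPE FROM TBLS;' <metastore_db>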

Hope that makes sense,

Thanks again!

Scott

On Dec 9, 2017, 9:46 AM -0800, Elliot West <tea...@gmail.com>, wrote:
> Which distribution are you using? Do you have hadoop-aws on the class path? 
> Is ‘/path/to/hadoop/install’ a literal value or a placeholder that you’re 
> using for the actual location?
>
> Cheers,
>
> Elliot.
>
> > On Sat, 9 Dec 2017 at 00:08, Scott Halgrim <scott.halg...@zapier.com> wrote:
> > > Hi,
> > >
> > > I’ve been struggling with this for a few hours, hopefully somebody here 
> > > can help me out.
> > >
> > > We have a lot of data in parquet format on S3 and we want to use Hive to 
> > > query it. I’m running on Ubuntu and we have a MySQL metadata store on AWS 
> > > RDS.
> > >
> > > The command in the hive client I’m trying to run is:
> > > > CREATE EXTERNAL TABLE
> > > > my_schema.my_table
> > > > (account_id INT,
> > > > action VARCHAR(282),
> > > > another_id INT,
> > > > yaid INT,
> > > > `date` TIMESTAMP,
> > > > deleted_at TIMESTAMP,
> > > > id INT,
> > > > lastchanged TIMESTAMP,
> > > > thing_index DOUBLE,
> > > > old_id INT,
> > > > parent_id INT,
> > > > running INT,
> > > > other_id INT,
> > > > another_thing VARCHAR(282),
> > > > title VARCHAR(282),
> > > > type_of VARCHAR(282))
> > > > PARTITIONED BY (snapshot_date DATE)
> > > > STORED AS parquet
> > > > LOCATION 's3a://bucket/folder/foo_my_schema.my_table';
> > >
> > > The error I get is:
> > > > FAILED: SemanticException java.lang.RuntimeException: 
> > > > java.lang.ClassNotFoundException: Class 
> > > > org.apache.hadoop.fs.s3a.S3AFileSystem not found
> > >
> > > I have this in my hive-site.xml file:
> > > > <property>
> > > > <name>hive.aux.jars.path</name>
> > > > <value>/path/to/hadoop-install/hadoop-2.9.0/share/hadoop/tools/lib</value>
> > > > </property>
> > >
> > >
> > > Another thing I tried was to create the external table without a location 
> > > and then alter it to have the location:
> > >
> > > > alter table my_schema.my_table set location 
> > > > "s3a://bucket/folder/foo_my_schema.my_table";
> > >
> > > And then I get a different error:
> > >
> > > > FAILED: SemanticException Cannot connect to namenode, please check if 
> > > > host/port pair for s3a://bucket/folder/foo_my_schema.my_table is valid
> > >
> > > What could I be missing?
> > >
> > > Thanks!
> > >
> > > Scott
