I'm trying to stand up a metastore instance for an integration test
environment. The system in question reads and
writes data into S3 and consequently manages Hive tables whose raw data
lives in S3. I've been successfully using Minio (https://www.minio.io) to
decouple other parts of the system from AWS so that it can stand alone, but
I've been having trouble getting this to play nicely with the metastore.

As I understand it, I can point the S3AFileSystem at an alternative S3
endpoint provided I enable path-style access. I'm using Hive 2.0.1 and Hadoop 2.8.0 deployed in
a docker container with the following additional site config:

fs.s3a.path.style.access=true
fs.s3a.endpoint=http://172.17.0.2:9000
fs.s3a.access.key=<akey>
fs.s3a.secret.key=<skey>
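Since my DDL uses the s3:// scheme, that scheme is also mapped to the
S3AFileSystem (hence the s3a classes in the stack trace below):

fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem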

And the following JVM system property:

com.amazonaws.services.s3.enableV4=true
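For the avoidance of doubt, I'm passing that property to the JVM via
HADOOP_OPTS in hive-env.sh:

export HADOOP_OPTS="$HADOOP_OPTS -Dcom.amazonaws.services.s3.enableV4=true"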
However, when trying to create a table in one of my Minio buckets I see the
following:

hive> create external table x (id string) location 's3://mybucket/x/';
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got
exception: org.apache.hadoop.fs.s3a.AWSS3IOException doesBucketExist on
mybucket: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request
(Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request
ID: null), S3 Extended Request ID: null: Bad Request (Service: Amazon S3;
Status Code: 400; Error Code: 400 Bad Request; Request ID: null))
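In case it helps with reproducing, the filesystem can also be exercised
directly with the same site config, bypassing Hive and the metastore
entirely (mybucket being one of my Minio buckets):

hadoop fs -ls s3a://mybucket/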


I've also tried Hive 2.1.1 with no luck. Can anyone advise?

Thanks,

Elliot.
