Hi,

I am trying to create an Iceberg table on MinIO S3 with a Hive catalog.

*This is how I launch spark-shell:*

# add Iceberg dependency
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=minio
export AWS_SECRET_ACCESS_KEY=minio123

ICEBERG_VERSION=0.11.1
DEPENDENCIES="org.apache.iceberg:iceberg-spark3-runtime:$ICEBERG_VERSION"

MINIOSERVER=192.168.160.5


# add AWS dependency
AWS_SDK_VERSION=2.15.40
AWS_MAVEN_GROUP=software.amazon.awssdk
AWS_PACKAGES=(
    "bundle"
    "url-connection-client"
)
for pkg in "${AWS_PACKAGES[@]}"; do
    DEPENDENCIES+=",$AWS_MAVEN_GROUP:$pkg:$AWS_SDK_VERSION"
done

# start Spark SQL client shell
/spark/bin/spark-shell --packages $DEPENDENCIES \
    --conf spark.sql.catalog.hive_test=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.hive_test.warehouse=s3a://east/prefix \
    --conf spark.sql.catalog.hive_test.type=hive \
    --conf spark.sql.catalog.hive_test.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
    --conf spark.hadoop.fs.s3a.endpoint=http://$MINIOSERVER:9000 \
    --conf spark.hadoop.fs.s3a.access.key=minio \
    --conf spark.hadoop.fs.s3a.secret.key=minio123 \
    --conf spark.hadoop.fs.s3a.path.style.access=true \
    --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
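
In case it helps with diagnosing, here is a sanity check that can run in
the same spark-shell (a sketch, pasted with :paste; the endpoint and
bucket are hard-coded from the values above, and the AWS bundle jar
already puts the v2 SDK on the classpath). It builds an S3 client the way
I understand S3FileIO does, from the default credential chain, i.e. the
AWS_* environment variables:

import java.net.URI
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.s3.{S3Client, S3Configuration}
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request

// SDK v2 client built like S3FileIO's, but pointed explicitly at MinIO
val s3 = S3Client.builder()
    .region(Region.US_EAST_1)
    .endpointOverride(URI.create("http://192.168.160.5:9000")) // $MINIOSERVER
    .serviceConfiguration(
        S3Configuration.builder().pathStyleAccessEnabled(true).build())
    .build()

// If this listing works, the env credentials reach the SDK v2 default
// chain fine, and the problem is likely the endpoint S3FileIO talks to.
s3.listObjectsV2(ListObjectsV2Request.builder().bucket("east").build())
    .contents()
    .forEach(o => println(o.key()))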

*Here is the spark code to create the iceberg table:*

import org.apache.spark.sql.SparkSession
val values = List(1,2,3,4,5)

val spark = SparkSession.builder().master("local").getOrCreate()
import spark.implicits._
val df = values.toDF()

val core = "mytable8"
val table = s"hive_test.mydb.${core}"
val s3IcePath = s"s3a://spark-test/${core}.ice"

df.writeTo(table)
    .tableProperty("write.format.default", "parquet")
    .tableProperty("location", s3IcePath)
    .createOrReplace()
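
For completeness, the round-trip check I planned to run once the write
succeeds (it never gets this far):

// never reached for me, since the write itself fails
spark.table(table).show()
// the reported location should be the s3IcePath set above
spark.sql(s"DESCRIBE EXTENDED $table").show(truncate = false)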

I get the error "The AWS Access Key Id you provided does not exist in our
records."

I have verified that I can log in to the MinIO UI with the same username
and password that I passed to spark-shell via the AWS_ACCESS_KEY_ID and
AWS_SECRET_ACCESS_KEY environment variables.
https://github.com/apache/iceberg/issues/2168 is related but does not help
me. I am not sure why the credentials do not work for Iceberg + AWS. Any
idea, or an example of writing an Iceberg table to S3 using a Hive
catalog, would be highly appreciated! Thanks.
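
P.S. My current guess is that the spark.hadoop.fs.s3a.* settings only
configure Hadoop's S3AFileSystem, while io-impl=S3FileIO makes Iceberg
build its own AWS SDK v2 client that never hears about the MinIO endpoint,
so the minio key would be sent to real AWS. One workaround I am
considering is a custom client factory registered through the iceberg-aws
client.factory catalog property. A rough, untested sketch
(MinioClientFactory is my own name, and I am assuming the 0.11.1
AwsClientFactory interface with s3/glue/kms/initialize):

import java.net.URI
import org.apache.iceberg.aws.AwsClientFactory
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.glue.GlueClient
import software.amazon.awssdk.services.kms.KmsClient
import software.amazon.awssdk.services.s3.{S3Client, S3Configuration}

class MinioClientFactory extends AwsClientFactory {
    // S3FileIO would call this instead of building its own client
    override def s3(): S3Client =
        S3Client.builder()
            .region(Region.US_EAST_1)
            // credentials still come from the AWS_* env variables via
            // the default credential chain
            .endpointOverride(URI.create("http://192.168.160.5:9000"))
            .serviceConfiguration(
                S3Configuration.builder().pathStyleAccessEnabled(true).build())
            .build()

    // not used with a Hive catalog
    override def glue(): GlueClient =
        throw new UnsupportedOperationException("Glue is not used here")

    override def kms(): KmsClient =
        throw new UnsupportedOperationException("KMS is not used here")

    override def initialize(properties: java.util.Map[String, String]): Unit = ()
}

The class would have to be packaged into a jar on the classpath (a class
defined in the REPL is not loadable by name) and registered with
--conf spark.sql.catalog.hive_test.client.factory=MinioClientFactory.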
