Hello, Awasthi Somesh.
Since I'm not quite sure exactly what problem you're having, I'd like to 
confirm one thing first.
I assume you are using Hive 4.0.0 + Hadoop 3.3.6 + Tez 0.10.3.
Can you first test reading and writing an Iceberg table on HDFS, without S3?
If you can read and write the Iceberg table on HDFS, then we can move on to 
discussing reads and writes against S3.
If you can't read or write the Iceberg table stored on HDFS either, we need to 
analyse the problem further.
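For example, a minimal smoke test could look like this (the table name is just 
a placeholder):

    CREATE TABLE iceberg_smoke_test (id INT, name STRING) STORED BY ICEBERG;
    INSERT INTO iceberg_smoke_test VALUES (1, 'a'), (2, 'b');
    SELECT * FROM iceberg_smoke_test;

If all three statements succeed against the default HDFS warehouse location, 
the Iceberg integration itself is fine and we can focus on S3.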
Tks.
LiSoDa.

On 2024-09-20 17:01:54, "Awasthi, Somesh" <soawas...@informatica.com.INVALID> wrote:

Hi Raghav,

 

How do I find the active NameNode? I tried with the command below, but it keeps retrying.
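For reference, in an HDFS HA setup the active NameNode can usually be found 
with hdfs haadmin (nn1 below is a placeholder NameNode ID; use the IDs from 
your hdfs-site.xml):

    hdfs haadmin -getAllServiceState
    hdfs haadmin -getServiceState nn1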

 

 

Thanks,

Somesh

 

From: Awasthi, Somesh
Sent: Friday, September 20, 2024 10:29 AM
To: u...@hive.apache.org; dev@hive.apache.org
Cc: Ayush Saxena <ayush...@gmail.com>; d...@iceberg.apache.org
Subject: RE: Hive 4 integration to store table on S3 and ADLS gen2

 

Thanks, Raghav, for the detailed explanation.

 

I will go through your details and follow the instructions. Below are a few of 
my findings.

 

I have set up the Docker images for Hive 4.0.0 and am able to store a table on 
S3, but inserts were failing, so we changed the execution engine to MR.

 

We set hive.execution.engine=mr and the insert works, but with MR we are not 
able to read a single table with Hive 4.0.0-alpha-2:

https://github.com/apache/iceberg/issues/11168
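For clarity, this is the toggle we are using in beeline (just a sketch):

    SET hive.execution.engine=mr;   -- inserts work, but reads fail (see the issue above)
    SET hive.execution.engine=tez;  -- reads work, but inserts fail with the error below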

 

I see that you also faced a similar problem 
(https://github.com/apache/iceberg/issues/7924); how did you resolve that issue?

 

Thanks a lot for your effort and help.

 

With Tez we are facing the error below while inserting records.

Error:

.6.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$exists$34(S3AFileSystem.java:4636) ~[hadoop-aws-3.3.6.jar:?]
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) ~[hadoop-common-3.3.6.jar:?]
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) ~[hadoop-common-3.3.6.jar:?]
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) ~[hadoop-common-3.3.6.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2480) ~[hadoop-aws-3.3.6.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2499) ~[hadoop-aws-3.3.6.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.exists(S3AFileSystem.java:4634) ~[hadoop-aws-3.3.6.jar:?]
at org.apache.tez.common.TezCommonUtils.getTezBaseStagingPath(TezCommonUtils.java:91) ~[tez-api-0.10.3.jar:0.10.3]
at org.apache.tez.common.TezCommonUtils.getTezSystemStagingPath(TezCommonUtils.java:149) ~[tez-api-0.10.3.jar:0.10.3]
at org.apache.tez.dag.app.DAGAppMaster.serviceInit(DAGAppMaster.java:492) ~[tez-dag-0.10.3.jar:0.10.3]
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) ~[hadoop-common-3.3.6.jar:?]
at org.apache.tez.dag.app.DAGAppMaster$9.run(DAGAppMaster.java:2644) ~[tez-dag-0.10.3.jar:0.10.3]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_342]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_342]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) ~[hadoop-common-3.3.6.jar:?]
at org.apache.tez.dag.app.DAGAppMaster.initAndStartAppMaster(DAGAppMaster.java:2641) ~[tez-dag-0.10.3.jar:0.10.3]
at org.apache.tez.client.LocalClient$1.run(LocalClient.java:361) ~[tez-dag-0.10.3.jar:0.10.3]
... 1 more
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask. java.io.IOException: org.apache.tez.dag.api.TezUncheckedException: java.nio.file.AccessDeniedException: s3a://com.anush/opt/hive/scratch_dir/hive/_tez_session_dir/0c1896fa-2b9d-4461-9ab4-ced0fd46ef48: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
INFO : Completed executing command(queryId=hive_20240919065346_a71fd349-e14c-4bfa-9fb7-0b1b396565e3); Time taken: 44.607 seconds
Error: Error while compiling statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask. java.io.IOException: org.apache.tez.dag.api.TezUncheckedException: java.nio.file.AccessDeniedException: s3a://com.anush/opt/hive/scratch_dir/hive/_tez_session_dir/0c1896fa-2b9d-4461-9ab4-ced0fd46ef48: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)) (state=08S01,code=1)
0: jdbc:hive2://localhost:10000/>
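From the trace, it is the Tez AM that cannot find any S3A credentials when it 
checks the staging path, so presumably the credentials need to be in a 
core-site.xml that the Tez AM also reads, not only in the HiveServer2 session. 
A minimal sketch of the entries (values are placeholders):

    <property>
      <name>fs.s3a.access.key</name>
      <value>YOUR_ACCESS_KEY</value>
    </property>
    <property>
      <name>fs.s3a.secret.key</name>
      <value>YOUR_SECRET_KEY</value>
    </property>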

 

 

Thanks,

Somesh

 

From: Raghav Aggarwal <raghavaggarwal03...@gmail.com>
Sent: Thursday, September 19, 2024 10:41 PM
To: dev@hive.apache.org
Cc: Ayush Saxena <ayush...@gmail.com>; u...@hive.apache.org; 
d...@iceberg.apache.org; Awasthi, Somesh <soawas...@informatica.com>
Subject: Re: Hive 4 integration to store table on S3 and ADLS gen2

 


 

Hi Somesh,

 

I have worked with Hive on S3 previously (while reviewing HIVE-28272) but 
haven't explicitly tried an Iceberg table on S3. This might help you: these are 
the S3-related configs that you will need in your setup.

 

Place         | Property name                 | Property value
------------- | ----------------------------- | ------------------------------------------------------------------------------
core-site.xml | fs.s3a.ssl.channel.mode       | default_jsse_with_gcm
core-site.xml | fs.s3a.connection.ssl.enabled | true
core-site.xml | fs.s3a.path.style.access      | true
core-site.xml | fs.s3a.endpoint               | https://hostname:port
core-site.xml | fs.s3a.access.key             | access key cred
core-site.xml | fs.s3a.secret.key             | secret key cred
hive-env.sh   | (add this variable)           | export HIVE_AUX_JARS_PATH="path to hadoop-aws jar:path to aws-sdk-bundle jar"

 

AWS-SDK-Bundle version used: 1.12.367.jar
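In core-site.xml form, the rows above would look roughly like this (the 
endpoint and credential values are placeholders):

    <property>
      <name>fs.s3a.ssl.channel.mode</name>
      <value>default_jsse_with_gcm</value>
    </property>
    <property>
      <name>fs.s3a.connection.ssl.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>fs.s3a.path.style.access</name>
      <value>true</value>
    </property>
    <property>
      <name>fs.s3a.endpoint</name>
      <value>https://hostname:port</value>
    </property>
    <property>
      <name>fs.s3a.access.key</name>
      <value>YOUR_ACCESS_KEY</value>
    </property>
    <property>
      <name>fs.s3a.secret.key</name>
      <value>YOUR_SECRET_KEY</value>
    </property>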

 

For a Hive 4.0.0, Hadoop 3.3.6, and Tez 0.10.3 standalone setup, I have a small 
GitHub repo which contains the config files for each component; you can refer 
to it and make changes accordingly. Project URL: 
https://github.com/Aggarwal-Raghav/local-setup-conf

 

I am attaching a doc which I wrote based on my understanding of Hive on S3 (the 
doc is present in the mentioned Hive JIRA PR as well); it might come in handy :-)

--

Thanks,

Raghav Aggarwal

 

On Thu, Sep 19, 2024 at 10:16 AM Awasthi, Somesh 
<soawas...@informatica.com.invalid> wrote:

Can anyone help here with what is wrong in the setup?

 

From: Awasthi, Somesh
Sent: Wednesday, September 18, 2024 1:34 PM
To: Ayush Saxena <ayush...@gmail.com>; dev@hive.apache.org
Cc: u...@hive.apache.org; d...@iceberg.apache.org
Subject: RE: Hive 4 integration to store table on S3 and ADLS gen2

 

Any ideas? Please suggest.

 

From: Awasthi, Somesh
Sent: Wednesday, September 18, 2024 11:52 AM
To: Ayush Saxena <ayush...@gmail.com>; dev@hive.apache.org
Cc: u...@hive.apache.org; d...@iceberg.apache.org
Subject: RE: Hive 4 integration to store table on S3 and ADLS gen2

 

Hi Ayush, thanks for your quick response.

 

Hadoop 3.3.6 is correct; what is wrong here?

 

How do I raise a bug for Hadoop? Could you please help with that?

 

One more question.

 

How do I set up Hive 4 standalone with Iceberg support, with tables stored on S3?
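For example, roughly what I am trying to get working (the bucket and path are 
placeholders):

    CREATE TABLE customer_ice (id INT, name STRING)
    STORED BY ICEBERG
    LOCATION 's3a://my-bucket/warehouse/customer_ice';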

 

Please give me the proper steps and docs so I can do this seamlessly.

 

Thanks for your support.

 

Thanks,

Somesh

 

From: Ayush Saxena <ayush...@gmail.com>
Sent: Wednesday, September 18, 2024 11:41 AM
To: dev@hive.apache.org
Cc: u...@hive.apache.org; d...@iceberg.apache.org; Awasthi, Somesh 
<soawas...@informatica.com>
Subject: Re: Hive 4 integration to store table on S3 and ADLS gen2

 


 

Hi Somesh,

 

"But while trying so we are seeing following exception:
hadoop fs -ls s3a://somesh.qa.bucket/"

 

This has nothing to do with Hive as such. You have configured the Hadoop S3 
client wrong and are missing configs; your hadoop ls command itself is failing, 
so there is no Hive involved here. You need to set up the FileSystem correctly...
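For instance, you can sanity-check the S3A filesystem on its own, with no Hive 
involved, by passing the configs inline (the key values are placeholders):

    hadoop fs \
      -D fs.s3a.access.key=YOUR_ACCESS_KEY \
      -D fs.s3a.secret.key=YOUR_SECRET_KEY \
      -ls s3a://somesh.qa.bucket/

Until that listing works, no Hive-side change will help.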

 

This is a Hadoop problem. Maybe you can explore the Hadoop S3A documentation 
[1], which might help; if you still face issues, you should ask on the Hadoop 
mailing lists, not Hive's.

 

-Ayush

 

[1] https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html

 

On Wed, 18 Sept 2024 at 11:12, Awasthi, Somesh 
<soawas...@informatica.com.invalid> wrote:

Hi Team,

 

I want to set up Hive 4 standalone with tables stored on S3 and ADLS Gen2 as 
storage.

 

Could you please help me with the proper steps and configurations required for 
this?

 

We are facing multiple issues with this, so please help me here ASAP.

 

What we tried:

 

I am trying to configure AWS S3 with the Hadoop and Hive setup.

But while trying to do so, we see the following exception from:

hadoop fs -ls s3a://somesh.qa.bucket/

Fatal internal error java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found

To resolve this, I added hadoop-aws-3.3.6.jar and aws-java-sdk-bundle-1.12.770.jar 
to the Hadoop classpath, i.e. under /usr/local/hadoop/share/hadoop/common/lib.
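To double-check that both jars are actually visible on the client classpath, 
something like this can be used (--glob expands the wildcard entries):

    hadoop classpath --glob | tr ':' '\n' | grep -Ei 'hadoop-aws|aws-java-sdk'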

And added the S3-related configurations to the core-site.xml file under the 
/usr/local/hadoop/etc/hadoop directory:

fs.default.name          = s3a://somesh.qa.bucket
fs.s3a.impl              = org.apache.hadoop.fs.s3a.S3AFileSystem
fs.s3a.endpoint          = s3.us-west-2.amazonaws.com
fs.s3a.access.key        = {Access_Key_Value}
fs.s3a.secret.key        = {Secret_Key_Value}
fs.s3a.path.style.access = false

Now when we try hadoop fs -ls s3a://somesh.qa.bucket/

We are observing the following exception:

2024-08-22 13:50:11,294 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2024-08-22 13:50:11,376 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2024-08-22 13:50:11,376 INFO impl.MetricsSystemImpl: s3a-file-system metrics system started
2024-08-22 13:50:11,434 WARN util.VersionInfoUtils: The AWS SDK for Java 1.x entered maintenance mode starting July 31, 2024 and will reach end of support on December 31, 2025. For more information, see https://aws.amazon.com/blogs/developer/the-aws-sdk-for-java-1-x-is-in-maintenance-mode-effective-july-31-2024/
You can print where on the file system the AWS SDK for Java 1.x core runtime is located by setting the AWS_JAVA_V1_PRINT_LOCATION environment variable or aws.java.v1.printLocation system property to 'true'.
This message can be disabled by setting the AWS_JAVA_V1_DISABLE_DEPRECATION_ANNOUNCEMENT environment variable or aws.java.v1.disableDeprecationAnnouncement system property to 'true'.
The AWS SDK for Java 1.x is being used here:
at java.lang.Thread.getStackTrace(Thread.java:1564)
at com.amazonaws.util.VersionInfoUtils.printDeprecationAnnouncement(VersionInfoUtils.java:81)
at com.amazonaws.util.VersionInfoUtils.<clinit>(VersionInfoUtils.java:59)
at com.amazonaws.internal.EC2ResourceFetcher.<init>(EC2ResourceFetcher.java:44)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.<init>(InstanceMetadataServiceCredentialsFetcher.java:38)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.<init>(InstanceProfileCredentialsProvider.java:111)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.<init>(InstanceProfileCredentialsProvider.java:91)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.<init>(InstanceProfileCredentialsProvider.java:75)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.<init>(InstanceProfileCredentialsProvider.java:58)
at com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper.initializeProvider(EC2ContainerCredentialsProviderWrapper.java:66)
at com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper.<init>(EC2ContainerCredentialsProviderWrapper.java:55)
at org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider.<init>(IAMInstanceCredentialsProvider.java:53)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.fs.s3a.S3AUtils.createAWSCredentialProvider(S3AUtils.java:727)
at org.apache.hadoop.fs.s3a.S3AUtils.buildAWSProviderList(S3AUtils.java:659)
at org.apache.hadoop.fs.s3a.S3AUtils.createAWSCredentialProviderSet(S3AUtils.java:585)
at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:959)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:586)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3611)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3712)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3663)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:557)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:347)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:264)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:247)
at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:105)
at org.apache.hadoop.fs.shell.Command.run(Command.java:191)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:327)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:390)
ls: s3a://infa.qa.bucket/: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
2024-08-22 13:50:14,248 INFO impl.MetricsSystemImpl: Stopping s3a-file-system metrics system...
2024-08-22 13:50:14,248 INFO impl.MetricsSystemImpl: s3a-file-system metrics system stopped.
2024-08-22 13:50:14,248 INFO impl.MetricsSystemImpl: s3a-file-system metrics syst
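One quick isolation test, since the provider chain above ends with the 
environment-variable lookup: export the credentials in the shell and retry 
(placeholder values):

    export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY
    export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY
    hadoop fs -ls s3a://somesh.qa.bucket/

If this works, the problem is presumably in how core-site.xml is being picked 
up rather than in the credentials themselves.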

 

 

Could you please help us resolve this issue as soon as possible?

 

Thanks,

Somesh

 




 

--

Thanks,

Raghav Aggarwal
