The logic is using “org.apache.hadoop.fs.s3a.S3AFileSystem”, as we can see in the stack trace. Shouldn’t it then be picking up the S3-related configuration in HADOOP_CONF_DIR? In Hadoop’s core-site.xml, we have the S3-related configuration parameters as below:
<property>
  <name>fs.s3a.endpoint</name>
  <value>http://accumulo-minio:9000</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>YYYYYYY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>XXXXXXX</value>
</property>

So, why do we need to create an AWS credentials file? Where do we create it, and what is the format?

Thanks,
Ranga

From: Christopher <ctubb...@apache.org>
Date: Friday, January 20, 2023 at 12:19 PM
To: accumulo-user <user@accumulo.apache.org>, Samudrala, Ranganath [USA] <samudrala_rangan...@bah.com>
Subject: [External] Re: Accumulo with S3

Based on the error message, it looks like you might need to configure each of the Accumulo nodes with the AWS credentials file.
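For reference, the file being suggested here is presumably the AWS SDK's shared credentials file, which the SDK's default credential chain looks for at ~/.aws/credentials for the user running each Accumulo process (the location can be overridden with the AWS_SHARED_CREDENTIALS_FILE environment variable). A minimal sketch, reusing the placeholder keys from the core-site.xml above:

    # ~/.aws/credentials -- INI format, one section per named profile
    [default]
    aws_access_key_id = YYYYYYY
    aws_secret_access_key = XXXXXXX

Whether this file is needed at all, rather than the fs.s3a.* keys in core-site.xml, depends on which credentials provider S3A ends up using on the Accumulo side.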
On Fri, Jan 20, 2023, 11:43 Samudrala, Ranganath [USA] via user <user@accumulo.apache.org> wrote:

Hello again! The next problem I am facing is configuring Minio S3 with Accumulo. I am referring to this document: https://accumulo.apache.org/blog/2019/09/10/accumulo-S3-notes.html

I have already invoked the command “accumulo init”, with and without the option “--upload-accumulo-props”, using accumulo.properties as below:

instance.volumes=hdfs://accumulo-hdfs-namenode-0.accumulo-hdfs-namenodes:8020/accumulo
instance.zookeeper.host=accumulo-zookeeper
general.volume.chooser=org.apache.accumulo.core.spi.fs.PreferredVolumeChooser
general.custom.volume.preferred.logger=hdfs://accumulo-hdfs-namenode-0.accumulo-hdfs-namenodes:8020/accumulo
general.custom.volume.preferred.default=hdfs://accumulo-hdfs-namenode-0.accumulo-hdfs-namenodes:8020/accumulo

Next, when I run the command “accumulo init --add-volumes” with accumulo.properties as below:

instance.volumes=s3a://minio-s3/accumulo,hdfs://accumulo-hdfs-namenode-0.accumulo-hdfs-namenodes:8020/accumulo
instance.zookeeper.host=accumulo-zookeeper
general.volume.chooser=org.apache.accumulo.core.spi.fs.PreferredVolumeChooser
general.custom.volume.preferred.logger=hdfs://accumulo-hdfs-namenode-0.accumulo-hdfs-namenodes:8020/accumulo
general.custom.volume.preferred.default=s3a://minio-s3/accumulo

I see the error below:

ERROR StatusLogger An exception occurred processing Appender MonitorLog
java.lang.RuntimeException: Can't tell if Accumulo is initialized; can't read instance id at s3a://minio-s3/accumulo/instance_id
    at org.apache.accumulo.server.fs.VolumeManager.getInstanceIDFromHdfs(VolumeManager.java:229)
    at org.apache.accumulo.server.ServerInfo.<init>(ServerInfo.java:102)
    at org.apache.accumulo.server.ServerContext.<init>(ServerContext.java:106)
    at org.apache.accumulo.monitor.util.logging.AccumuloMonitorAppender.lambda$new$1(AccumuloMonitorAppender.java:93)
    at org.apache.accumulo.monitor.util.logging.AccumuloMonitorAppender.append(AccumuloMonitorAppender.java:111)
    at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:161)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:134)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:125)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:89)
    at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:683)
    at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:641)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:624)
    at org.apache.logging.log4j.core.config.LoggerConfig.logParent(LoggerConfig.java:674)
    at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:643)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:624)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:612)
    at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:98)
    at org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:488)
    at org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:156)
    at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:51)
    at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:29)
    at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:168)
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.nio.file.AccessDeniedException: s3a://minio-s3/accumulo/instance_id: listStatus on s3a://minio-s3/accumulo/instance_id: com.amazonaws.services.s3.model.AmazonS3Exception: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: 71HC1ZM3D43W0H67; S3 Extended Request ID: OsRVgg057cm+M7EP+P069hY97mA6na8rkhnNVunVRTUmttCDc5Sm5aKqodS+oogU5/UupgsEy1A=; Proxy: null), S3 Extended Request ID: OsRVgg057cm+M7EP+P069hY97mA6na8rkhnNVunVRTUmttCDc5Sm5aKqodS+oogU5/UupgsEy1A=:InvalidAccessKeyId
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:255)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:119)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$21(S3AFileSystem.java:3263)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2356)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:3262)
    at org.apache.accumulo.server.fs.VolumeManager.getInstanceIDFromHdfs(VolumeManager.java:211)
    ... 23 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: 71HC1ZM3D43W0H67; S3 Extended Request ID: OsRVgg057cm+M7EP+P069hY97mA6na8rkhnNVunVRTUmttCDc5Sm5aKqodS+oogU5/UupgsEy1A=; Proxy: null), S3 Extended Request ID: OsRVgg057cm+M7EP+P069hY97mA6na8rkhnNVunVRTUmttCDc5Sm5aKqodS+oogU5/UupgsEy1A=
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1387)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5397)
    at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$11(S3AFileSystem.java:2595)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:414)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:377)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2586)
    at org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:2153)
    at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:87)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    ... 1 more
When I invoke commands from HDFS, I see no problems though:

* hdfs dfs -fs s3a://minio-s3 -ls /

2023-01-20 16:38:51,319 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser]: Sanitizing XML document destined for handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListObjectsV2Handler
2023-01-20 16:38:51,321 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager]: Connection [id: 0][route: {}->http://accumulo-minio:9000] can be kept alive for 60.0 seconds
2023-01-20 16:38:51,321 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.thirdparty.apache.http.impl.conn.DefaultManagedHttpClientConnection]: http-outgoing-0: set socket timeout to 0
2023-01-20 16:38:51,321 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager]: Connection released: [id: 0][route: {}->http://accumulo-minio:9000][total available: 1; route allocated: 1 of 128; total allocated: 1 of 128]
2023-01-20 16:38:51,321 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser]: Parsing XML response document with handler: class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListObjectsV2Handler
2023-01-20 16:38:51,328 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser]: Examining listing for bucket: minio-s3
2023-01-20 16:38:51,329 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.request]: Received successful response: 200, AWS Request ID: 173C11CC6FEF29A0
2023-01-20 16:38:51,329 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.requestId]: x-amzn-RequestId: not available
2023-01-20 16:38:51,329 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.requestId]: AWS Request ID: 173C11CC6FEF29A0
2023-01-20 16:38:51,338 DEBUG [s3a-transfer-minio-s3-unbounded-pool2-t1] [com.amazonaws.latency]: ServiceName=[Amazon S3], StatusCode=[200], ServiceEndpoint=[http://accumulo-minio:9000], RequestType=[ListObjectsV2Request], AWSRequestID=[173C11CC6FEF29A0], HttpClientPoolPendingCount=0, RetryCapacityConsumed=0, HttpClientPoolAvailableCount=0, RequestCount=1, HttpClientPoolLeasedCount=0, ResponseProcessingTime=[71.198], ClientExecuteTime=[297.496], HttpClientSendRequestTime=[7.255], HttpRequestTime=[119.87], ApiCallLatency=[279.779], RequestSigningTime=[56.006], CredentialsRequestTime=[5.091, 0.015], HttpClientReceiveResponseTime=[12.849]
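A closing note on the mismatch above (hdfs dfs authenticates, Accumulo gets a 403): hdfs dfs reads core-site.xml straight out of HADOOP_CONF_DIR, while Accumulo processes only see those settings if HADOOP_CONF_DIR is on the classpath that accumulo-env.sh builds, so that is worth verifying first. Separately, two S3A settings that MinIO-backed deployments commonly need are sketched below; both property names are real S3A options, but whether they resolve this particular InvalidAccessKeyId error is an assumption, not something the thread confirms:

<property>
  <!-- MinIO buckets are usually addressed path-style (http://host:9000/bucket/key), not as virtual hosts -->
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>
<property>
  <!-- Pin S3A to the fs.s3a.access.key/fs.s3a.secret.key pair above, instead of the SDK's default provider chain -->
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider</value>
</property>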