omlomloml commented on issue #5698: URL: https://github.com/apache/hudi/issues/5698#issuecomment-1147623150
omlomloml commented on issue #5698: URL: https://github.com/apache/hudi/issues/5698#issuecomment-1147623150

After I got past the jar issue, I hit an S3 permission issue, which is also very strange. I set the AWS credentials in the environment, and from the logs before the failure it looks like the Hive sync tool can access S3 without any problem: it can load the commits and everything. But when the call goes through the Hive jars, it looks like they cannot access S3. Is there any other place I need to set AWS credentials? I tried hive-site.xml, but it did not help.

Another question: it looks like the sync tool still requires Hive to be installed even if I only want to use hms mode. What we really want is to run the HMS sync from Spark when we write to Hudi tables. If we use the Spark bundle with hms mode, do we still need Hive to be installed?

Thanks, Heng

Here is the log:

Running Command : java -cp /opt/hive/lib/hive-metastore-3.1.3.jar::/opt/hive/lib/hive-service-3.1.3.jar::/opt/hive/lib/hive-exec-3.1.3.jar::/opt/hive/lib/hive-jdbc-3.1.3.jar:/opt/hive/lib/hive-jdbc-handler-3.1.3.jar::/opt/hive/lib/jackson-annotations-2.12.0.jar:/opt/hive/lib/jackson-core-2.12.0.jar:/opt/hive/lib/jackson-core-asl-1.9.13.jar:/opt/hive/lib/jackson-databind-2.12.0.jar:/opt/hive/lib/jackson-dataformat-smile-2.12.0.jar:/opt/hive/lib/jackson-mapper-asl-1.9.13.jar:/opt/hive/lib/jackson-module-scala_2.11-2.12.0.jar::/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hive/lib/*:/lib/spark/jars/aws-java-sdk-bundle-1.12.129.jar:/lib/spark/jars/hadoop-aws-3.2.0.jar:/opt/hadoop/etc/hadoop:/opt/hudi/hudi-sync/hudi-hive-sync/../../packaging/hudi-hive-sync-bundle/target/hudi-hive-sync-bundle-0.11.0.jar org.apache.hudi.hive.HiveSyncTool --database gpr - -table runs --metastore-uris thrift://hive-metastore:9083 --base-path s3a://wavesense-test-hudi/runs/ --sync-mode hms SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 2022-06-06 15:55:59,377 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(60)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2022-06-06 15:55:59,588 INFO [main] impl.MetricsConfig (MetricsConfig.java:loadFirst(118)) - Loaded properties from hadoop-metrics2.properties 2022-06-06 15:55:59,644 INFO [main] impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(374)) - Scheduled Metric snapshot period at 10 second(s). 2022-06-06 15:55:59,644 INFO [main] impl.MetricsSystemImpl (MetricsSystemImpl.java:start(191)) - s3a-file-system metrics system started 2022-06-06 15:56:01,258 INFO [main] conf.HiveConf (HiveConf.java:findConfigFile(187)) - Found configuration file null 2022-06-06 15:56:01,453 INFO [main] table.HoodieTableMetaClient (HoodieTableMetaClient.java:<init>(117)) - Loading HoodieTableMetaClient from s3a://wavesense-test-hudi/runs/ 2022-06-06 15:56:01,616 INFO [main] table.HoodieTableConfig (HoodieTableConfig.java:<init>(242)) - Loading table properties from s3a://wavesense-test-hudi/runs/.hoodie/hoodie.properties 2022-06-06 15:56:01,685 INFO [main] table.HoodieTableMetaClient (HoodieTableMetaClient.java:<init>(136)) - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3a://wavesense-test-hudi/runs/ 2022-06-06 15:56:01,685 INFO [main] table.HoodieTableMetaClient (HoodieTableMetaClient.java:<init>(139)) - Loading Active commit timeline for s3a://wavesense-test-hudi/runs/ 2022-06-06 15:56:01,813 INFO [main] timeline.HoodieActiveTimeline 
(HoodieActiveTimeline.java:<init>(131)) - Loaded instants upto : Option{val=[20220524201523059__clean__COMPLETED]} 2022-06-06 15:56:02,198 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://hive-metastore.default.svc.cluster.local:9083 2022-06-06 15:56:02,216 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 1 2022-06-06 15:56:02,233 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2022-06-06 15:56:02,234 INFO [main] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=root (auth:SIMPLE) retries=1 delay=1 lifetime=0 2022-06-06 15:56:02,368 INFO [main] hive.HiveSyncTool (HiveSyncTool.java:syncHoodieTable(135)) - Syncing target hoodie table with hive table(runs). Hive metastore URL :jdbc:hive2://localhost:10000, basePath :s3a://wavesense-test-hudi/runs/ 2022-06-06 15:56:02,368 INFO [main] hive.HiveSyncTool (HiveSyncTool.java:syncHoodieTable(177)) - Trying to sync hoodie table runs with base path s3a://wavesense-test-hudi/runs of type COPY_ON_WRITE 2022-06-06 15:56:02,689 INFO [main] table.TableSchemaResolver (TableSchemaResolver.java:readSchemaFromParquetBaseFile(479)) - Reading schema from s3a://wavesense-test-hudi/runs/2022-02/61d91e74-8331-492c-89ce-4ae7792cd8cc-0_0-13934-418302_20220524201445667.parquet 2022-06-06 15:56:02,789 INFO [main] s3a.S3AInputStream (S3AInputStream.java:seekInStream(286)) - Switching to Random IO seek policy 2022-06-06 15:56:03,092 INFO [main] hive.HiveSyncTool (HiveSyncTool.java:syncSchema(259)) - Hive table runs is not found. 
Creating it 2022-06-06 15:56:03,203 ERROR [main] ddl.HMSDDLExecutor (HMSDDLExecutor.java:createTable(129)) - failed to create table runs MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://wavesense-test-hudi/runs: getFileStatus on s3a://wavesense-test-hudi/runs: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 2FH0H0BYJ8TQ07YM; S3 Extended Request ID: noYjCZuh8UjxAo4xWLGdXW4o1DJnIPmdde8WngnDnr7B6tEeYiTBn7BTQscdHPexdvlgZbx2cqw=), S3 Extended Request ID: noYjCZuh8UjxAo4xWLGdXW4o1DJnIPmdde8WngnDnr7B6tEeYiTBn7BTQscdHPexdvlgZbx2cqw=:403 Forbidden) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:54908) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:54876) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:54802) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1556) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1542) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2867) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:121) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:837) at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:822) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) at com.sun.proxy.$Proxy21.createTable(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2773) at com.sun.proxy.$Proxy21.createTable(Unknown Source) at org.apache.hudi.hive.ddl.HMSDDLExecutor.createTable(HMSDDLExecutor.java:127) at org.apache.hudi.hive.HoodieHiveClient.createTable(HoodieHiveClient.java:168) at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:276) at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:217) at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:150) at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:138) at org.apache.hudi.hive.HiveSyncTool.main(HiveSyncTool.java:433) 2022-06-06 15:56:03,208 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:close(600)) - Closed a connection to metastore, current connections: 0 Exception in thread "main" org.apache.hudi.exception.HoodieException: Got runtime exception when hive syncing runs at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:141) at org.apache.hudi.hive.HiveSyncTool.main(HiveSyncTool.java:433) Caused by: org.apache.hudi.hive.HoodieHiveSyncException: failed to create 
table runs at org.apache.hudi.hive.ddl.HMSDDLExecutor.createTable(HMSDDLExecutor.java:130) at org.apache.hudi.hive.HoodieHiveClient.createTable(HoodieHiveClient.java:168) at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:276) at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:217) at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:150) at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:138) ... 1 more Caused by: MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://wavesense-test-hudi/runs: getFileStatus on s3a://wavesense-test-hudi/runs: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 2FH0H0BYJ8TQ07YM; S3 Extended Request ID: noYjCZuh8UjxAo4xWLGdXW4o1DJnIPmdde8WngnDnr7B6tEeYiTBn7BTQscdHPexdvlgZbx2cqw=), S3 Extended Request ID: noYjCZuh8UjxAo4xWLGdXW4o1DJnIPmdde8WngnDnr7B6tEeYiTBn7BTQscdHPexdvlgZbx2cqw=:403 Forbidden) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:54908) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:54876) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:54802) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1556) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1542) at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2867) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:121) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:837) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:822) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) at com.sun.proxy.$Proxy21.createTable(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2773) at com.sun.proxy.$Proxy21.createTable(Unknown Source) at org.apache.hudi.hive.ddl.HMSDDLExecutor.createTable(HMSDDLExecutor.java:127) ... 6 more 2022-06-06 15:56:03,212 INFO [shutdown-hook-0] impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping s3a-file-system metrics system... 2022-06-06 15:56:03,212 INFO [shutdown-hook-0] impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - s3a-file-system metrics system stopped. 2022-06-06 15:56:03,213 INFO [shutdown-hook-0] impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - s3a-file-system metrics system shutdown complete. 
root@hudi-cli-59c5dd55f-8gvmd:/opt/hudi/hudi-sync/hudi-hive-sync#
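Note that the 403 in the trace is thrown from the metastore side: `create_table_with_environment_context` triggers a `getFileStatus` on `s3a://wavesense-test-hudi/runs`, which runs inside the Hive Metastore service, not inside the sync tool. So the credentials also need to be visible to the HMS process itself. A minimal sketch of how that could look, assuming the metastore reads the standard Hadoop S3A properties from its `core-site.xml` (or `hive-site.xml`) — the key values here are placeholders, not real credentials:

```xml
<!-- Sketch only: S3A credential settings for the Hive Metastore's
     core-site.xml / hive-site.xml. ACCESS_KEY / SECRET_KEY are placeholders. -->
<configuration>
  <!-- Option 1: credentials inline (placeholders) -->
  <property>
    <name>fs.s3a.access.key</name>
    <value>ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>SECRET_KEY</value>
  </property>
  <!-- Option 2: instead of inline keys, pick up the AWS_* environment
       variables of the metastore process -->
  <property>
    <name>fs.s3a.aws.credentials.provider</name>
    <value>com.amazonaws.auth.EnvironmentVariableCredentialsProvider</value>
  </property>
</configuration>
```

If the metastore runs in a separate pod or host (as the `thrift://hive-metastore:9083` URI suggests), exporting `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` only where the sync tool runs would not help that remote process, which would match the symptom of the sync tool reading S3 fine while table creation fails.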
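On the second question, hms sync mode talks to the metastore over Thrift, so it does not need HiveServer2/JDBC. When writing from Spark, the sync can be enabled through write options instead of running the standalone tool. A hedged sketch in PySpark style, using the database/table/URI values from the log above (the `df.write` part is shown as a comment since it needs a live Spark session):

```python
# Sketch: Hudi write options that enable HMS-mode hive sync from a Spark job.
# Database/table/metastore values are taken from the command in the log above.
hudi_options = {
    "hoodie.table.name": "runs",
    "hoodie.datasource.hive_sync.enable": "true",
    "hoodie.datasource.hive_sync.mode": "hms",
    "hoodie.datasource.hive_sync.metastore.uris": "thrift://hive-metastore:9083",
    "hoodie.datasource.hive_sync.database": "gpr",
    "hoodie.datasource.hive_sync.table": "runs",
}

def sync_options(opts):
    """Helper (for illustration only): pick out the hive_sync-related options."""
    return {k: v for k, v in opts.items() if "hive_sync" in k}

# In a real job this would be applied as:
#   df.write.format("hudi").options(**hudi_options).mode("append") \
#       .save("s3a://wavesense-test-hudi/runs/")
```

With this setup the Spark job's own Hadoop/S3A configuration is used for the data path, while the metastore still performs its own S3 access when it validates the table location, so the metastore-side credentials note above still applies.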
