Re: [PR] [#3919] feat(catalog-lakehouse-paimon): Support hive backend for Paimon Catalog [gravitino]

2024-10-15 Thread via GitHub


caican00 commented on code in PR #5092:
URL: https://github.com/apache/gravitino/pull/5092#discussion_r1800568074


##
docs/lakehouse-paimon-catalog.md:
##
@@ -22,38 +22,40 @@ Builds with Apache Paimon `0.8.0`.
 
 ### Catalog capabilities
 
-- Works as a catalog proxy, supporting `FilesystemCatalog` and `JdbcCatalog`.
+- Works as a catalog proxy, supporting `FilesystemCatalog`, `JdbcCatalog` and `HiveCatalog`.
 - Supports DDL operations for Paimon schemas and tables.
 
-- Doesn't support `HiveCatalog` catalog backend now.
 - Doesn't support alterSchema.
 
 ### Catalog properties
 
-| Property name | Description | Default value | Required | Since Version |
-|---|---|---|---|---|
-| `catalog-backend` | Catalog backend of Gravitino Paimon catalog. Supports `filesystem` and `jdbc` now. | (none) | Yes | 0.6.0-incubating |
-| `uri` | The URI configuration of the Paimon catalog. `thrift://127.0.0.1:9083` or `jdbc:postgresql://127.0.0.1:5432/db_name` or `jdbc:mysql://127.0.0.1:3306/metastore_db`. It is optional for `FilesystemCatalog`. | (none) | required if the value of `catalog-backend` is not `filesystem`. | 0.6.0-incubating |
-| `warehouse` | Warehouse directory of catalog. `file:///user/hive/warehouse-paimon/` for local fs, `hdfs://namespace/hdfs/path` for HDFS, `s3://{bucket-name}/path/` for S3 or `oss://{bucket-name}/path` for Aliyun OSS | (none) | Yes | 0.6.0-incubating |
-| `authentication.type` | The type of authentication for Paimon catalog backend, currently Gravitino only supports `Kerberos` and `simple`. | `simple` | No | 0.6.0-incubating |
-| `authentication.kerberos.principal` | The principal of the Kerberos authentication. | (none) | required if the value of `authentication.type` is Kerberos. | 0.6.0-incubating |
-| `authentication.kerberos.keytab-uri` | The URI of the keytab for the Kerberos authentication. | (none) | required if the value of `authentication.type` is Kerberos. | 0.6.0-incubating |
-| `authentication.kerberos.check-interval-sec` | The check interval of Kerberos credential for Paimon catalog. | 60 | No | 0.6.0-incubating |
-| `authentication.kerberos.keytab-fetch-timeout-sec` | The fetch timeout of retrieving Kerberos keytab from `authentication.kerberos.keytab-uri`. | 60 | No | 0.6.0-incubating |
-| `oss-endpoint` | The endpoint of the Aliyun oss. | (none) | required if the value of `warehouse` is an oss path | 0.7.0-incubating |
-| `oss-access-key-id` | The access key of the Aliyun oss.
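For quick orientation, the properties in the quoted table combine roughly like this for a Hive-backed Paimon catalog. This is an editorial sketch, not taken from the PR: the hostnames and paths are illustrative, and the backend value `hive` is implied by the PR title but not shown in the quoted hunk.

```properties
catalog-backend = hive
uri = thrift://127.0.0.1:9083
warehouse = hdfs://namespace/hdfs/path
authentication.type = simple
```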
[PR] [#4014] feat(catalog-lakehouse-paimon): Support kerberos auth for Paimon HiveCatalog [gravitino]

2024-10-15 Thread via GitHub


caican00 opened a new pull request, #5136:
URL: https://github.com/apache/gravitino/pull/5136

   ### What changes were proposed in this pull request?
   
   Support kerberos auth for Paimon HiveCatalog.
   
   ### Why are the changes needed?
   
   Fix: https://github.com/apache/gravitino/issues/4014
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   New ITs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gravitino.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] [Bug report] HMS HA Feature unsupported [gravitino]

2024-10-15 Thread via GitHub


xxzhky opened a new issue, #5137:
URL: https://github.com/apache/gravitino/issues/5137

   ### Version
   
   0.6.0
   
   ### Describe what's wrong
   
   Gravitino catalog config properties:
   Key | Value
   -- | --
   gravitino.bypass.hive.metastore.client.capability.check | FALSE
   metastore.uris | thrift://ecs-dev-66-133-flink.msxf.host:9089,thrift://ecs-dev-66-100-flink.msxf.host:9089
   kerberos.principal | hive/ecs-dev-64-179-flink.msxf.host@DPOPSTEST.HADOOP
   gravitino.bypass.hive.metastore.kerberos.principal | hive/ecs-dev-66-133-flink.msxf.host@DPOPSTEST.HADOOP
   kerberos.keytab-uri | file:///home/xdt/gravikey/hms4/hive.keytab
   gravitino.bypass.hive.metastore.sasl.enabled | TRUE
   gravitino.bypass.hadoop.security.authentication | kerberos
   
   Reason: When configuring the Iceberg catalog, attempting to use multiple Hive Metastore instances to meet the production environment's high-availability (HA) requirements results in the following error: `transport.TSaslTransport: SASL negotiation failure; javax.security.sasl.SaslException: GSS initiate failed`.
   
   Note: For the relevant catalog configuration information, please refer to the table above. Specifically: `metastore.uris` is configured with two instances, and `gravitino.bypass.hive.metastore.kerberos.principal` is configured with two principals.
   
   ### Error message and/or stacktrace
   
   2024-10-12T10:52:10,393 ERROR [Metastore-Handler-Pool: Thread-63] transport.TSaslTransport: SASL negotiation failure
   javax.security.sasl.SaslException: GSS initiate failed
   at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199) ~[?:1.8.0_232]
   at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:507) ~[hive-exec-4.0.0.jar:4.0.0]
   at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:250) ~[hive-exec-4.0.0.jar:4.0.0]
   at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:44) ~[hive-exec-4.0.0.jar:4.0.0]
   at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:199) ~[hive-exec-4.0.0.jar:4.0.0]
   at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:711) ~[hive-exec-4.0.0.jar:4.0.0]
   at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:707) ~[hive-exec-4.0.0.jar:4.0.0]
   at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_232]
   at javax.security.auth.Subject.doAs(Subject.java:360) ~[?:1.8.0_232]
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1855) ~[hadoop-common-3.3.4.jar:?]
   at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:707) ~[hive-exec-4.0.0.jar:4.0.0]
   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:227) ~[hive-exec-4.0.0.jar:4.0.0]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_232]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_232]
   at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_232]
   Caused by: org.ietf.jgss.GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
   at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:858) ~[?:1.8.0_232]
   at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342) ~[?:1.8.0_232]
   at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285) ~[?:1.8.0_232]
   at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:167) ~[?:1.8.0_232]
   ... 14 more
   Caused by: sun.security.krb5.KrbCryptoException: Checksum failed
   at sun.security.krb5.internal.crypto.Aes128CtsHmacSha1EType.decrypt(Aes128CtsHmacSha1EType.java:102) ~[?:1.8.0_232]
   at sun.security.krb5.internal.crypto.Aes128CtsHmacSha1EType.decrypt(Aes128CtsHmacSha1EType.java:94) ~[?:1.8.0_232]
   at sun.security.krb5.EncryptedData.decrypt(EncryptedData.java:175) ~[?:1.8.0_232]
   at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:281) ~[?:1.8.0_232]
   at sun.security.krb5.KrbApReq.<init>(KrbApReq.java:149) ~[?:1.8.0_232]
   at sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:108) ~[?:1.8.0_232]
   at sun.security.jg
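An editorial aside, not from the thread: Kerberized HMS HA deployments commonly avoid per-instance principals on the client side by using the `_HOST` placeholder, which Hadoop's security layer substitutes with the metastore host actually being contacted. A hypothetical sketch (the hostnames are illustrative; the realm is taken from the report above):

```properties
metastore.uris = thrift://hms-1.example.com:9089,thrift://hms-2.example.com:9089
gravitino.bypass.hive.metastore.kerberos.principal = hive/_HOST@DPOPSTEST.HADOOP
```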

Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluginized manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#issuecomment-2413069338

   @jerryshao 
   The code has been updated. 





Re: [I] [Improvement] Response code from Gravitino server will make frontend page 404 [gravitino]

2024-10-15 Thread via GitHub


jerqi commented on issue #4831:
URL: https://github.com/apache/gravitino/issues/4831#issuecomment-2413076236

   For the access control API, I will change some 404 codes to 400. For more details, see https://github.com/apache/gravitino/issues/5105





Re: [PR] [#3973] improvement(core):support audit log [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on code in PR #4575:
URL: https://github.com/apache/gravitino/pull/4575#discussion_r1800987291


##
core/src/main/java/org/apache/gravitino/audit/FileAuditWriter.java:
##
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.audit;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import java.io.FileNotFoundException;
+import java.io.FileOutputStream;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.util.Map;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.gravitino.exceptions.GravitinoRuntimeException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * DefaultFileAuditWriter is the default implementation of AuditLogWriter, 
which writes audit logs
+ * to a file.
+ */
+public class FileAuditWriter implements AuditLogWriter {
+  private static final Logger Log = LoggerFactory.getLogger(FileAuditWriter.class);
+
+  public static final String LOG_FILE_NAME_CONFIG = "name";
+
+  public static final String FILE_IMMEDIATE_FLUSH_CONFIG = "immediateFlush";
+
+  private Formatter formatter;
+  private Writer outWriter;
+  @VisibleForTesting String fileName;
+
+  boolean immediateFlush;
+
+  @Override
+  public Formatter getFormatter() {
+    return formatter;
+  }
+
+  @Override
+  public void init(Formatter formatter, Map<String, String> properties) {
+    this.formatter = formatter;
+    fileName = properties.getOrDefault(LOG_FILE_NAME_CONFIG, "default_gravitino_audit_log");
+    immediateFlush =
+        Boolean.parseBoolean(properties.getOrDefault(FILE_IMMEDIATE_FLUSH_CONFIG, "false"));
+    Preconditions.checkArgument(
+        StringUtils.isNotBlank(fileName), "FileAuditWriter: fileName is not set in configuration.");
+    try {
+      OutputStream outputStream = new FileOutputStream(fileName, true);
+      outWriter = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
+    } catch (FileNotFoundException e) {
+      throw new GravitinoRuntimeException(
+          String.format("Audit log file: %s is not exists", fileName));

Review Comment:
   `is not exists` doesn't seem correct; there may be other reasons, such as a permission issue.
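A possible rewording along the lines of the comment above. This is an illustrative sketch only: the class name, method name, and message wording are hypothetical, and a plain `RuntimeException` stands in for `GravitinoRuntimeException` so the snippet is self-contained.

```java
import java.io.FileNotFoundException;

public class AuditWriterErrorMessageSketch {
  // FileNotFoundException is raised both when the path is missing and when the
  // file cannot be opened for other reasons (e.g. permission denied), so the
  // message avoids claiming the file "does not exist".
  public static String message(String fileName) {
    return String.format(
        "Audit log file %s cannot be opened (missing path, permission denied, ...)", fileName);
  }

  public static void main(String[] args) {
    try {
      // Simulate the failure the catch block under review handles.
      throw new FileNotFoundException("/var/log/audit.log (Permission denied)");
    } catch (FileNotFoundException e) {
      // Keep the original exception as the cause instead of dropping it.
      RuntimeException wrapped = new RuntimeException(message("/var/log/audit.log"), e);
      System.out.println(wrapped.getMessage());
    }
  }
}
```

Passing the caught exception as the cause also preserves the OS-level detail (missing directory vs. permission) in the stack trace.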



##
core/src/main/java/org/apache/gravitino/audit/Formatter.java:
##
@@ -0,0 +1,34 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.audit;
+
+import org.apache.gravitino.listener.api.event.Event;
+
+/** The interface that defines the conversion of a metadata change event to a unified log format. */
+public interface Formatter {
+
+  /**
+   * Format the event, returning the unified audit log format.
+   *
+   * @param event The event to format.
+   * @return The formatted event.

Review Comment:
   `@return The formatted event` not correct



##
core/src/main/java/org/apache/gravitino/audit/SimpleAuditLog.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the Li

Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1801108915


##
clients/filesystem-hadoop3/build.gradle.kts:
##
@@ -71,6 +75,11 @@ tasks.build {
   dependsOn("javadoc")
 }
 
+tasks.compileJava {

Review Comment:
   Why do we need this?






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1801125699


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -119,26 +125,35 @@ public void initialize(
       Map<String, String> config, CatalogInfo info, HasPropertyMetadata propertiesMetadata)
       throws RuntimeException {
     this.propertiesMetadata = propertiesMetadata;
-    // Initialize Hadoop Configuration.
-    this.conf = config;
-    this.hadoopConf = new Configuration();
     this.catalogInfo = info;
-    Map<String, String> bypassConfigs =
-        config.entrySet().stream()
-            .filter(e -> e.getKey().startsWith(CATALOG_BYPASS_PREFIX))
-            .collect(
-                Collectors.toMap(
-                    e -> e.getKey().substring(CATALOG_BYPASS_PREFIX.length()),
-                    Map.Entry::getValue));
-    bypassConfigs.forEach(hadoopConf::set);
+
+    this.conf = config;
+
+    String fileSystemProviders =
+        (String)
+            propertiesMetadata
+                .catalogPropertiesMetadata()
+                .getOrDefault(
+                    config, HadoopCatalogPropertiesMetadata.FILESYSTEM_PROVIDERS_CLASSNAMES);
+    FileSystemUtils.initFileSystemProviders(fileSystemProviders, fileSystemProvidersMap);
+
+    String defaultFileSystemProviderClassName =
+        (String)
+            propertiesMetadata
+                .catalogPropertiesMetadata()
+                .getOrDefault(
+                    config, HadoopCatalogPropertiesMetadata.DEFAULT_FS_PROVIDER_CLASSNAME);
+    this.defaultFileSystemProviderScheme =
+        StringUtils.isNotBlank(defaultFileSystemProviderClassName)
+            ? FileSystemUtils.getSchemeByFileSystemProvider(
+                defaultFileSystemProviderClassName, fileSystemProvidersMap)
+            : LOCAL_FILE_SCHEME;

Review Comment:
   You can create the default fs provider and get the scheme from it; there is no need for this logic.






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1801119181


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogPropertiesMetadata.java:
##
@@ -34,6 +36,21 @@ public class HadoopCatalogPropertiesMetadata extends 
BaseCatalogPropertiesMetada
   // If not, users have to specify the storage location in the Schema or 
Fileset level.
   public static final String LOCATION = "location";
 
+  /**
+   * The class names of {@link FileSystemProvider} to be added to the catalog. Except built-in
+   * FileSystemProvider like LocalFileSystemProvider and HDFSFileSystemProvider, users can add their
+   * own FileSystemProvider by specifying the class name here. The value can be
+   * '.yyy.FileSystemProvider1,.yyy.FileSystemProvider2'.
+   */
+  public static final String FILESYSTEM_PROVIDERS_CLASSNAMES = "filesystem-providers-classnames";

Review Comment:
   The property names here and below are too long; you'd better figure out better names. I think "filesystem-providers" and "default-filesystem-provider" should be fine.






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1801124561


##
clients/filesystem-hadoop3/build.gradle.kts:
##
@@ -71,6 +75,11 @@ tasks.build {
   dependsOn("javadoc")
 }
 
+tasks.compileJava {

Review Comment:
   This module depends on `catalog-hadoop`; however, because the output directories of the tasks `:catalogs:catalog-hadoop:jar` and `:catalogs:catalog-hadoop:runtimeJars` are the same (lib/build/), we need to add these two dependencies or the Gradle compile will fail. The same goes for the other modules:
   
   ```gradle
   tasks.test {
     val skipITs = project.hasProperty("skipITs")
     if (skipITs) {
       exclude("**/integration/test/**")
     } else {
       dependsOn(":catalogs:catalog-hadoop:jar", ":catalogs:catalog-hadoop:runtimeJars")
       dependsOn(":catalogs:catalog-hive:jar", ":catalogs:catalog-hive:runtimeJars")
       dependsOn(":catalogs:catalog-kafka:jar", ":catalogs:catalog-kafka:runtimeJars")
     }
   }
   ```






Re: [PR] [#5086] core(feat): Add logic to support manipulating columns in TableOperationDispatcher [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on code in PR #5127:
URL: https://github.com/apache/gravitino/pull/5127#discussion_r1801127715


##
catalogs/catalog-jdbc-common/src/main/java/org/apache/gravitino/catalog/jdbc/operation/JdbcTableOperations.java:
##
@@ -168,6 +168,15 @@ protected JdbcTable.Builder getTableBuilder(
 return builder;
   }
 
+  protected JdbcColumn.Builder getColumnBuilder(

Review Comment:
   How about adding the `TABLE_SCHEM` check here, and removing the `getColumnBuilder` in PG?






Re: [PR] [#4951] improvement(test): Reduce fields shares between different IT to make AbstractIT more independent. [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on PR #4996:
URL: https://github.com/apache/gravitino/pull/4996#issuecomment-2413298239

   Please address the conflicts here.





[PR] [#5105] improvement(server,client): Error code optimization about access control API [gravitino]

2024-10-15 Thread via GitHub


jerqi opened a new pull request, #5144:
URL: https://github.com/apache/gravitino/pull/5144

   ### What changes were proposed in this pull request?
   
   Error code optimization about access control API
   
   ### Why are the changes needed?
   
   Fix: #5105 
   
   ### Does this PR introduce _any_ user-facing change?
   Yes.
   
   ### How was this patch tested?
   
   Modify some UTs





Re: [PR] [#3973] improvement(core):support audit log [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on PR #4575:
URL: https://github.com/apache/gravitino/pull/4575#issuecomment-2413271699

   @FANNG1 can you please do another round of review?





Re: [PR] [#5145][flink] Shrink the flink-connector-runtime lib size from 84MB to 16MB [gravitino]

2024-10-15 Thread via GitHub


coolderli commented on PR #4819:
URL: https://github.com/apache/gravitino/pull/4819#issuecomment-2413356596

   @FANNG1 @mchades I have addressed the comment. Please take a look. Thanks.





Re: [PR] [#3973] improvement(core):support audit log [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on code in PR #4575:
URL: https://github.com/apache/gravitino/pull/4575#discussion_r1801040725


##
core/src/main/java/org/apache/gravitino/audit/SimpleAuditLog.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.audit;
+
+import java.text.SimpleDateFormat;
+import lombok.Builder;
+
+/** The default implementation of the audit log. */
+@Builder
+public class SimpleAuditLog implements AuditLog {
+
+  private String user;
+
+  private Operation operation;
+
+  private String identifier;
+
+  private long timestamp;
+
+  private Status status;
+
+  @Override
+  public String user() {
+    return user;
+  }
+
+  @Override
+  public Operation operation() {
+    return operation;
+  }
+
+  @Override
+  public String identifier() {
+    return identifier;
+  }
+
+  @Override
+  public long timestamp() {
+    return timestamp;
+  }
+
+  @Override
+  public Status status() {
+    return status;
+  }
+
+  @Override
+  public String toString() {
+    return String.format(
+        "[%s] %s %s %s %s",
+        new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(timestamp),

Review Comment:
   We use spaces as separators between the different items; it seems we should avoid using a space inside the timestamp.






[PR] [SIP] support Iceberg event listener [gravitino]

2024-10-15 Thread via GitHub


FANNG1 opened a new pull request, #5147:
URL: https://github.com/apache/gravitino/pull/5147

   
   
   ### What changes were proposed in this pull request?
   
   (Please outline the changes and how this PR fixes the issue.)
   
   ### Why are the changes needed?
   
   (Please clarify why the changes are needed. For instance,
 1. If you propose a new API, clarify the use case for a new API.
 2. If you fix a bug, describe the bug.)
   
   Fix: # (issue)
   
   ### Does this PR introduce _any_ user-facing change?
   
   (Please list the user-facing changes introduced by your change, including
 1. Change in user-facing APIs.
 2. Addition or removal of property keys.)
   
   ### How was this patch tested?
   
   (Please test your changes, and provide instructions on how to test it:
 1. If you add a feature or fix a bug, add a test to cover your changes.
 2. If you fix a flaky test, repeat it for many times to prove it works.)
   





Re: [PR] [MINOR] Shrink the flink-connector-runtime lib size from 84MB to 16MB [gravitino]

2024-10-15 Thread via GitHub


mchades commented on PR #4819:
URL: https://github.com/apache/gravitino/pull/4819#issuecomment-2413161057

   This is not a minor change; could you please create a new subtask issue of #4513 to record this?





Re: [PR] [#4951] improvement(test): Reduce fields shares between different IT to make AbstractIT more independent. [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on code in PR #4996:
URL: https://github.com/apache/gravitino/pull/4996#discussion_r1800653954


##
trino-connector/integration-test/src/test/java/org/apache/gravitino/trino/connector/integration/test/TrinoQueryITBase.java:
##
@@ -68,14 +68,17 @@ public class TrinoQueryITBase {
   protected static final String metalakeName = "test";
   protected static GravitinoMetalake metalake;
 
-  private static void setEnv() throws Exception {
+  private static BaseIT baseIT;
+
+  private void setEnv() throws Exception {
+baseIT = new BaseIT();

Review Comment:
   I think so; refactoring this part will be a separate piece of work for me.






Re: [PR] [MINOR] Shrink the flink-connector-runtime lib size from 84MB to 16MB [gravitino]

2024-10-15 Thread via GitHub


coolderli commented on code in PR #4819:
URL: https://github.com/apache/gravitino/pull/4819#discussion_r1800654345


##
flink-connector/flink-runtime/build.gradle.kts:
##
@@ -67,6 +66,11 @@ publishing {
   }
 }
 
+tasks.register("copy") {

Review Comment:
   OK






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluginized manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1800657095


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -119,26 +125,28 @@ public void initialize(
       Map<String, String> config, CatalogInfo info, HasPropertyMetadata propertiesMetadata)
       throws RuntimeException {
     this.propertiesMetadata = propertiesMetadata;
-    // Initialize Hadoop Configuration.
-    this.conf = config;
-    this.hadoopConf = new Configuration();
     this.catalogInfo = info;
-    Map<String, String> bypassConfigs =
-        config.entrySet().stream()
-            .filter(e -> e.getKey().startsWith(CATALOG_BYPASS_PREFIX))
-            .collect(
-                Collectors.toMap(
-                    e -> e.getKey().substring(CATALOG_BYPASS_PREFIX.length()),
-                    Map.Entry::getValue));
-    bypassConfigs.forEach(hadoopConf::set);
+
+    this.conf = config;
+
+    String fileSystemProviders =
+        (String)
+            propertiesMetadata
+                .catalogPropertiesMetadata()
+                .getOrDefault(config, HadoopCatalogPropertiesMetadata.FILESYSTEM_PROVIDERS);
+    FileSystemUtils.initFileSystemProviders(fileSystemProviders, fileSystemProvidersMap);
+
+    this.defaultFilesystemProvider =
+        (String)
+            propertiesMetadata
+                .catalogPropertiesMetadata()
+                .getOrDefault(config, HadoopCatalogPropertiesMetadata.DEFAULT_FS_PROVIDER);

Review Comment:
   What is the default value for this property? Should it be "file"?



##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -742,4 +750,35 @@ private boolean checkSingleFile(Fileset fileset) {
   fileset.name());
 }
   }
+
+  FileSystem getFileSystem(Path path, Map<String, String> config) throws IOException {
+if (path == null) {
+  throw new IllegalArgumentException("Path should not be null");
+}
+
+// Can't get the scheme from the path like '/path/to/file', use the 
default filesystem provider.
+if (path.toUri().getScheme() == null) {
+  if (defaultFilesystemProvider != null) {
+return getFileSystemByScheme(defaultFilesystemProvider, config, path);
+  }
+
+  LOG.warn(
+  "Can't get schema from path: {} and default filesystem provider is 
null, using"

Review Comment:
   scheme...



##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -742,4 +750,35 @@ private boolean checkSingleFile(Fileset fileset) {
   fileset.name());
 }
   }
+
+  FileSystem getFileSystem(Path path, Map<String, String> config) throws IOException {
+if (path == null) {
+  throw new IllegalArgumentException("Path should not be null");
+}
+
+// Can't get the scheme from the path like '/path/to/file', use the 
default filesystem provider.
+if (path.toUri().getScheme() == null) {
+  if (defaultFilesystemProvider != null) {
+return getFileSystemByScheme(defaultFilesystemProvider, config, path);
+  }
+
+  LOG.warn(
+  "Can't get schema from path: {} and default filesystem provider is 
null, using"
+  + " local file system",
+  path);
+  return getFileSystemByScheme(LOCAL_FILE_SCHEME, config, path);
+}
+
+return getFileSystemByScheme(path.toUri().getScheme(), config, path);
+  }
+
+  private FileSystem getFileSystemByScheme(String scheme, Map<String, String> config, Path path)
+  throws IOException {
+FileSystemProvider provider = fileSystemProvidersMap.get(scheme);
+if (provider == null) {
+  throw new IllegalArgumentException("Unsupported scheme: " + scheme);

Review Comment:
   You should clearly tell the user why the exception happened; the message here 
makes it hard for the user to know the actual reason.
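   An error message along the lines suggested above might look like this (a 
sketch only; the class and method names are illustrative, not the actual 
Gravitino code):

```java
import java.util.Map;

public class SchemeErrorDemo {
  // Build a diagnostic that names the unsupported scheme AND lists the
  // schemes that are actually registered, so the user can tell whether a
  // provider is missing or the path is simply wrong.
  static String unsupportedScheme(String scheme, Map<String, ?> providers) {
    return "Unsupported scheme: " + scheme
        + ". No FileSystemProvider is registered for this scheme; registered schemes are: "
        + providers.keySet()
        + ". Register one via the 'filesystem-providers' catalog property.";
  }

  public static void main(String[] args) {
    System.out.println(unsupportedScheme("s3", Map.of("file", 0, "hdfs", 0)));
  }
}
```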



##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -90,6 +90,10 @@ public class HadoopCatalogOperations implements 
CatalogOperations, SupportsSchem
 
   private CatalogInfo catalogInfo;
 
+  private final Map<String, FileSystemProvider> fileSystemProvidersMap = Maps.newHashMap();
+
+  private String defaultFilesystemProvider;

Review Comment:
   You can maintain a default `FileSystemProvider`, not just a string. Besides, 
it is `FileSystem`, not `Filesystem`.



##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -742,4 +750,35 @@ private boolean checkSingleFile(Fileset fileset) {
   fileset.name());
 }
   }
+
+  FileSystem getFileSystem(Path path, Map<String, String> config) throws IOException {
+if (path == null) {
+  throw new IllegalArgumentException("Path should not be null");
+}
+
+// Can't get the scheme from the path like '/path/to/file', use the 
default filesystem provider.
+if (path.toUri().getScheme(

Re: [PR] [#3973] improvement(core):support audit log [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on code in PR #4575:
URL: https://github.com/apache/gravitino/pull/4575#discussion_r1800978054


##
core/src/main/java/org/apache/gravitino/audit/AuditLogWriter.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.audit;
+
+import java.io.Closeable;
+import java.util.Map;
+import org.apache.gravitino.listener.api.event.Event;
+
+/**
+ * Interface for writing the audit log, which can write to different storages, such as
+ * file, database, or MQ.
+ */
+public interface AuditLogWriter extends Closeable {
+
+  /** @return formatter. */
+  Formatter getFormatter();
+
+  /**
+   * Initialize the writer with the given configuration.
+   *
+   * @param formatter the formatter used to convert events into audit log entries
+   * @param properties the configuration properties for this writer
+   */
+  void init(Formatter formatter, Map<String, String> properties);
+
+  /**
+   * Write the audit event to storage.
+   *
+   * @param auditLog
+   */
+  void doWrite(AuditLog auditLog);
+
+  /**
+   * Write the audit event to storage.
+   *
+   * @param event
+   */
+  default void write(Event event) {
+doWrite(getFormatter().format(event));
+  }
+
+  /** Close the writer. */
+  void close();

Review Comment:
   why re-declare the `close` method in `AuditLogWriter`?
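   For context on the question above: `Closeable` already declares `close()`, so 
re-declaring it in a sub-interface is redundant unless it re-documents the 
method or narrows the checked `throws IOException` clause, as in this sketch 
(names are illustrative):

```java
import java.io.Closeable;

// Re-declaring close() without `throws IOException` narrows the checked
// exception away, so callers need no try/catch. This is one legitimate
// reason for the re-declaration; plain repetition is redundant.
interface QuietWriter extends Closeable {
  @Override
  void close(); // no checked exception, unlike Closeable.close()
}

public class CloseDemo {
  public static void main(String[] args) {
    QuietWriter w = () -> System.out.println("closed");
    w.close(); // compiles without try/catch
  }
}
```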






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1801226391


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -119,26 +125,35 @@ public void initialize(
  Map<String, String> config, CatalogInfo info, HasPropertyMetadata propertiesMetadata)
   throws RuntimeException {
 this.propertiesMetadata = propertiesMetadata;
-// Initialize Hadoop Configuration.
-this.conf = config;
-this.hadoopConf = new Configuration();
 this.catalogInfo = info;
-Map bypassConfigs =
-config.entrySet().stream()
-.filter(e -> e.getKey().startsWith(CATALOG_BYPASS_PREFIX))
-.collect(
-Collectors.toMap(
-e -> e.getKey().substring(CATALOG_BYPASS_PREFIX.length()),
-Map.Entry::getValue));
-bypassConfigs.forEach(hadoopConf::set);
+
+this.conf = config;
+
+String fileSystemProviders =
+(String)
+propertiesMetadata
+.catalogPropertiesMetadata()
+.getOrDefault(
+config, 
HadoopCatalogPropertiesMetadata.FILESYSTEM_PROVIDERS_CLASSNAMES);
+FileSystemUtils.initFileSystemProviders(fileSystemProviders, 
fileSystemProvidersMap);
+
+String defaultFileSystemProviderClassName =
+(String)
+propertiesMetadata
+.catalogPropertiesMetadata()
+.getOrDefault(
+config, 
HadoopCatalogPropertiesMetadata.DEFAULT_FS_PROVIDER_CLASSNAME);
+this.defaultFileSystemProviderScheme =
+StringUtils.isNotBlank(defaultFileSystemProviderClassName)
+? FileSystemUtils.getSchemeByFileSystemProvider(
+defaultFileSystemProviderClassName, fileSystemProvidersMap)
+: LOCAL_FILE_SCHEME;

Review Comment:
   If we use the service loader mechanism to load file system providers, I can't 
see why we need to provide the configuration `filesystem-providers`, please see 
below
   
   ```java
   // fileSystemProviders is useless if we use ServiceLoader
   public static Map<String, FileSystemProvider> getFileSystemProviders(String fileSystemProviders) {
     Map<String, FileSystemProvider> resultMap = Maps.newHashMap();
     ServiceLoader<FileSystemProvider> fileSystemProvidersLoader =
         ServiceLoader.load(FileSystemProvider.class);
     fileSystemProvidersLoader.forEach(
         fileSystemProvider -> resultMap.put(fileSystemProvider.name(), fileSystemProvider));
     return resultMap;
   }
   ```
   
   then the only configuration we need to provide is 
`default-filesystem-provider`. @jerryshao what do you think about it, is it 
okay for you?
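   For background on the service-loader approach discussed above: 
`ServiceLoader.load(FileSystemProvider.class)` discovers implementations 
through a provider-configuration file on the classpath; a sketch (the 
implementation class names are illustrative, and the interface's package is 
assumed):

```
# File: META-INF/services/org.apache.gravitino.catalog.hadoop.fs.FileSystemProvider
# One fully-qualified implementation class per line:
org.example.gravitino.S3FileSystemProvider
org.example.gravitino.OssFileSystemProvider
```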
   
   
   






Re: [PR] [#5105] improvement(server,client): Error code optimization about access control API [gravitino]

2024-10-15 Thread via GitHub


jerqi closed pull request #5144: [#5105] improvement(server,client): Error code 
optimization about access control API
URL: https://github.com/apache/gravitino/pull/5144





Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluginized manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerqi commented on PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#issuecomment-2413422271

   `pluginized` -> `pluggable`.





Re: [I] [Subtask] Shrink flink-runtime connector binary package size [gravitino]

2024-10-15 Thread via GitHub


FANNG1 closed issue #5145: [Subtask] Shrink flink-runtime connector binary 
package size
URL: https://github.com/apache/gravitino/issues/5145





(gravitino) branch main updated: [#5145] improvement(flink-connector): Shrink the flink-connector-runtime lib size from 84MB to 16MB (#4819)

2024-10-15 Thread fanng
This is an automated email from the ASF dual-hosted git repository.

fanng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git


The following commit(s) were added to refs/heads/main by this push:
 new f0e7f3675 [#5145] improvement(flink-connector): Shrink the 
flink-connector-runtime lib size from 84MB to 16MB (#4819)
f0e7f3675 is described below

commit f0e7f3675f6875ea8b83b65e940b90e6541a0d82
Author: Peidian li <38486782+coolde...@users.noreply.github.com>
AuthorDate: Tue Oct 15 17:55:32 2024 +0800

[#5145] improvement(flink-connector): Shrink the flink-connector-runtime 
lib size from 84MB to 16MB (#4819)

### What changes were proposed in this pull request?

- shrink the flink-connector-runtime jar.
- Change client-java scope from implementation to CompileOnly in flink
module.
- use the client-java-runtime in flink-connector-runtime.

### Why are the changes needed?
Fix: #5145


- The flink-connector-runtime jar is much too big, as mentioned in
https://github.com/apache/gravitino/pull/4791#discussion_r1738193021.


### Does this PR introduce _any_ user-facing change?

- no

### How was this patch tested?

- local test
---
 flink-connector/flink-runtime/build.gradle.kts |  1 -
 flink-connector/flink/build.gradle.kts | 20 ++--
 2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/flink-connector/flink-runtime/build.gradle.kts 
b/flink-connector/flink-runtime/build.gradle.kts
index 63349ba69..1a7164644 100644
--- a/flink-connector/flink-runtime/build.gradle.kts
+++ b/flink-connector/flink-runtime/build.gradle.kts
@@ -56,7 +56,6 @@ tasks.withType(ShadowJar::class.java) {
   relocate("com.google", "org.apache.gravitino.shaded.com.google")
   relocate("google", "org.apache.gravitino.shaded.google")
   relocate("org.apache.hc", "org.apache.gravitino.shaded.org.apache.hc")
-  relocate("org.apache.commons", 
"org.apache.gravitino.shaded.org.apache.commons")
 }
 
 publishing {
diff --git a/flink-connector/flink/build.gradle.kts 
b/flink-connector/flink/build.gradle.kts
index 0cc814876..9e2a48c03 100644
--- a/flink-connector/flink/build.gradle.kts
+++ b/flink-connector/flink/build.gradle.kts
@@ -38,18 +38,10 @@ val scalaVersion: String = "2.12"
 val artifactName = 
"${rootProject.name}-flink-${flinkMajorVersion}_$scalaVersion"
 
 dependencies {
-  implementation(project(":api"))
   implementation(project(":catalogs:catalog-common"))
-  implementation(project(":common"))
-
-  compileOnly(libs.bundles.log4j)
-  implementation(libs.commons.lang3)
   implementation(libs.guava)
-  implementation(libs.httpclient5)
-  implementation(libs.jackson.databind)
-  implementation(libs.jackson.annotations)
-  implementation(libs.jackson.datatype.jdk8)
-  implementation(libs.jackson.datatype.jsr310)
+
+  compileOnly(project(":clients:client-java-runtime", configuration = 
"shadow"))
 
   
compileOnly("org.apache.flink:flink-connector-hive_$scalaVersion:$flinkVersion")
   compileOnly("org.apache.flink:flink-table-common:$flinkVersion")
@@ -76,13 +68,13 @@ dependencies {
 exclude("org.slf4j")
   }
 
-  // replace with client-java-runtime in flink connector runtime
-  compileOnly(project(":clients:client-java"))
-
   testAnnotationProcessor(libs.lombok)
-
   testCompileOnly(libs.lombok)
+
+  testImplementation(project(":api"))
   testImplementation(project(":clients:client-java"))
+  testImplementation(project(":core"))
+  testImplementation(project(":common"))
   testImplementation(project(":integration-test-common", "testArtifacts"))
   testImplementation(project(":server"))
   testImplementation(project(":server-common"))



Re: [PR] [#5145] improvement(flink-connector): Shrink the flink-connector-runtime lib size from 84MB to 16MB [gravitino]

2024-10-15 Thread via GitHub


FANNG1 merged PR #4819:
URL: https://github.com/apache/gravitino/pull/4819





[I] [Improvement] Support Trino-446 version [gravitino]

2024-10-15 Thread via GitHub


xxzhky opened a new issue, #5138:
URL: https://github.com/apache/gravitino/issues/5138

   ### What would you like to be improved?
   
   Gravitino connector configuration:
   
   Property | Type | Default Value
   -- | -- | --
   connector.name | string | dlable
   gravitino.metalake | string | xx
   gravitino.uri | string | http://localhost:8090
   
   Reason: when using the latest Trino v446 version supported by JDK 21 (LTS), 
after configuring the above-mentioned connector, the following startup error 
occurs: "Unsupported Trino-446 version. The supported version for the 
Gravitino-Trino-connector is from Trino-435 to Trino-439." Could the community 
consider supporting the latest version of the Trino connector?
   
   ### How should we improve?
   
   Would you like to support the latest version of Trino?
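   The property table above maps to a Trino catalog properties file, e.g. 
`etc/catalog/gravitino.properties` (values taken from the table; the file path 
is the usual Trino convention, shown as an illustration):

```
connector.name=dlable
gravitino.metalake=xx
gravitino.uri=http://localhost:8090
```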





Re: [PR] [MINOR] Shrink the flink-connector-runtime lib size from 84MB to 16MB [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on code in PR #4819:
URL: https://github.com/apache/gravitino/pull/4819#discussion_r1800634840


##
flink-connector/flink-runtime/build.gradle.kts:
##
@@ -67,6 +66,11 @@ publishing {
   }
 }
 
+tasks.register("copy") {

Review Comment:
   please remove this






Re: [PR] [MINOR] Shrink the flink-connector-runtime lib size from 84MB to 16MB [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on PR #4819:
URL: https://github.com/apache/gravitino/pull/4819#issuecomment-2413146975

   LGTM except one comment





[I] [EPIC] Add support for creating, editing, and deleting schema [gravitino]

2024-10-15 Thread via GitHub


LauraXia123 opened a new issue, #5140:
URL: https://github.com/apache/gravitino/issues/5140

   ### Describe the proposal
   
   Add support for create/edit/delete schema
   
   ### Task list
   
   rt





Re: [PR] [MINOR] Shrink the flink-connector-runtime lib size from 84MB to 16MB [gravitino]

2024-10-15 Thread via GitHub


coolderli closed pull request #4819: [MINOR] Shrink the flink-connector-runtime 
lib size from 84MB to 16MB
URL: https://github.com/apache/gravitino/pull/4819





Re: [I] [Improvement] Use the plugin storage framework for GVFS client [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on issue #5097:
URL: https://github.com/apache/gravitino/issues/5097#issuecomment-2413113760

   This issue is duplicated with #5019 





Re: [PR] [#5086] core(feat): Add logic to support manipulating columns in TableOperationDispatcher [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5127:
URL: https://github.com/apache/gravitino/pull/5127#discussion_r1800927829


##
catalogs/catalog-jdbc-postgresql/src/main/java/org/apache/gravitino/catalog/postgresql/operation/PostgreSqlTableOperations.java:
##
@@ -108,6 +108,17 @@ protected JdbcTable.Builder getTableBuilder(
 return builder;
   }
 
+  @Override
+  protected JdbcColumn.Builder getColumnBuilder(
+  ResultSet columnsResult, String databaseName, String tableName) throws 
SQLException {
+JdbcColumn.Builder builder = null;
+if (Objects.equals(columnsResult.getString("TABLE_NAME"), tableName)
+&& Objects.equals(columnsResult.getString("TABLE_SCHEM"), 
databaseName)) {
+  builder = getBasicJdbcColumnInfo(columnsResult);
+}
+return builder;
+  }

Review Comment:
   @yuqi1129 @FANNG1 please take a look at this fix.






Re: [PR] [#4993] feat(iceberg): integrate credential framework to iceberg REST server [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on PR #5134:
URL: https://github.com/apache/gravitino/pull/5134#issuecomment-2413566111

   @jerryshao , it's ready for review now, please help to review when you are 
free, thanks





(gravitino) branch main updated: [#5112] feat(core): support pre event for event listener systems (#5110)

2024-10-15 Thread jshao
This is an automated email from the ASF dual-hosted git repository.

jshao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git


The following commit(s) were added to refs/heads/main by this push:
 new e9acd15a8 [#5112] feat(core): support pre event for event listener 
systems (#5110)
e9acd15a8 is described below

commit e9acd15a8586f75a2e95a95753030620906aa20b
Author: FANNG 
AuthorDate: Tue Oct 15 15:42:28 2024 +0800

[#5112] feat(core): support pre event for event listener systems (#5110)

### What changes were proposed in this pull request?
support pre event for event listener systems
1. add new `PreEvent` to represent Pre event and only `SYNC` event
listeners could process `PreEvent`
2. keep `Event` as post event to keep compatibility.
3. `EventBus` dispatch event to corresponding event listeners.

### Why are the changes needed?
Fix: #5112

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
add UT
---
 .../gravitino/listener/AsyncQueueListener.java |  49 +---
 .../org/apache/gravitino/listener/EventBus.java|  37 --
 .../listener/EventListenerPluginWrapper.java   |  37 +-
 .../listener/api/EventListenerPlugin.java  |  30 +++--
 .../api/event/{Event.java => BaseEvent.java}   |   4 +-
 .../apache/gravitino/listener/api/event/Event.java |  57 +
 .../gravitino/listener/api/event/PreEvent.java |  31 +
 .../gravitino/listener/DummyEventListener.java |  37 --
 .../listener/TestEventListenerManager.java | 133 ++---
 .../listener/api/event/TestCatalogEvent.java   |  24 ++--
 .../listener/api/event/TestFilesetEvent.java   |  26 ++--
 .../listener/api/event/TestMetalakeEvent.java  |  20 ++--
 .../listener/api/event/TestPartitionEvent.java |  24 ++--
 .../listener/api/event/TestSchemaEvent.java|  20 ++--
 .../listener/api/event/TestTableEvent.java |  24 ++--
 .../listener/api/event/TestTopicEvent.java |  20 ++--
 16 files changed, 375 insertions(+), 198 deletions(-)

diff --git 
a/core/src/main/java/org/apache/gravitino/listener/AsyncQueueListener.java 
b/core/src/main/java/org/apache/gravitino/listener/AsyncQueueListener.java
index 641bc3eb5..18043964d 100644
--- a/core/src/main/java/org/apache/gravitino/listener/AsyncQueueListener.java
+++ b/core/src/main/java/org/apache/gravitino/listener/AsyncQueueListener.java
@@ -29,7 +29,9 @@ import java.util.concurrent.LinkedBlockingQueue;
 import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.concurrent.atomic.AtomicLong;
 import org.apache.gravitino.listener.api.EventListenerPlugin;
+import org.apache.gravitino.listener.api.event.BaseEvent;
 import org.apache.gravitino.listener.api.event.Event;
+import org.apache.gravitino.listener.api.event.PreEvent;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -44,7 +46,7 @@ public class AsyncQueueListener implements 
EventListenerPlugin {
   private static final String NAME_PREFIX = "async-queue-listener-";
 
   private final List<EventListenerPlugin> eventListeners;
-  private final BlockingQueue<Event> queue;
+  private final BlockingQueue<BaseEvent> queue;
   private final Thread asyncProcessor;
   private final int dispatcherJoinSeconds;
   private final AtomicBoolean stopped = new AtomicBoolean(false);
@@ -68,20 +70,13 @@ public class AsyncQueueListener implements 
EventListenerPlugin {
   }
 
   @Override
-  public void onPostEvent(Event event) {
-if (stopped.get()) {
-  LOG.warn(
-  "{} drop event: {}, since AsyncQueueListener is stopped",
-  asyncQueueListenerName,
-  event.getClass().getSimpleName());
-  return;
-}
-
-if (queue.offer(event)) {
-  return;
-}
+  public void onPreEvent(PreEvent event) {
+enqueueEvent(event);
+  }
 
-logDropEventsIfNecessary();
+  @Override
+  public void onPostEvent(Event event) {
+enqueueEvent(event);
   }
 
   @Override
@@ -117,8 +112,14 @@ public class AsyncQueueListener implements 
EventListenerPlugin {
   private void processEvents() {
 while (!Thread.currentThread().isInterrupted()) {
   try {
-Event event = queue.take();
-this.eventListeners.forEach(listener -> listener.onPostEvent(event));
+BaseEvent baseEvent = queue.take();
+if (baseEvent instanceof PreEvent) {
+  this.eventListeners.forEach(listener -> 
listener.onPreEvent((PreEvent) baseEvent));
+} else if (baseEvent instanceof Event) {
+  this.eventListeners.forEach(listener -> listener.onPostEvent((Event) 
baseEvent));
+} else {
+  LOG.warn("Unknown event type: {}", 
baseEvent.getClass().getSimpleName());
+}
   } catch (InterruptedException e) {
 LOG.warn("{} event dispatcher thread is interrupted.", 
asyncQueueListenerName);
 break;
@@ -154,4 +155,20 @@ public class AsyncQueueListener implements 

Re: [I] [Improvement] Possble null pointer exception in DorisTableOperations.java [gravitino]

2024-10-15 Thread via GitHub


itsaviu commented on issue #4605:
URL: https://github.com/apache/gravitino/issues/4605#issuecomment-2413153131

   @justinmclean , the changes have been merged to master, so we can close this 
issue. Thanks
   cc: @yuqi1129 





Re: [PR] [#4951] improvement(test): Reduce fields shares between different IT to make AbstractIT more independent. [gravitino]

2024-10-15 Thread via GitHub


mchades commented on code in PR #4996:
URL: https://github.com/apache/gravitino/pull/4996#discussion_r1800622096


##
trino-connector/integration-test/src/test/java/org/apache/gravitino/trino/connector/integration/test/TrinoQueryITBase.java:
##
@@ -68,14 +68,17 @@ public class TrinoQueryITBase {
   protected static final String metalakeName = "test";
   protected static GravitinoMetalake metalake;
 
-  private static void setEnv() throws Exception {
+  private static BaseIT baseIT;
+
+  private void setEnv() throws Exception {
+baseIT = new BaseIT();

Review Comment:
   Let me summarize the differences between the two usages:
   Extending `BaseIT` automatically starts the Gravitino server and client 
before the test begins. If a developer wishes to start the Gravitino server 
manually, they should create a `BaseIT` instance, is that correct?






Re: [PR] [#4951] improvement(test): Reduce fields shares between different IT to make AbstractIT more independent. [gravitino]

2024-10-15 Thread via GitHub


mchades commented on code in PR #4996:
URL: https://github.com/apache/gravitino/pull/4996#discussion_r1800657782


##
trino-connector/integration-test/src/test/java/org/apache/gravitino/trino/connector/integration/test/TrinoQueryITBase.java:
##
@@ -68,14 +68,17 @@ public class TrinoQueryITBase {
   protected static final String metalakeName = "test";
   protected static GravitinoMetalake metalake;
 
-  private static void setEnv() throws Exception {
+  private static BaseIT baseIT;
+
+  private void setEnv() throws Exception {
+baseIT = new BaseIT();

Review Comment:
   If that's the case, I believe adding comments to the class `BaseIT` is 
sufficient and there is no need to refactor it.
   
   






Re: [I] [Improvement] Use the plugin storage framework for GVFS client [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 closed issue #5097: [Improvement] Use the plugin storage framework for 
GVFS client
URL: https://github.com/apache/gravitino/issues/5097





Re: [PR] [#5112] feat(core): support pre event for event listener systems [gravitino]

2024-10-15 Thread via GitHub


jerryshao merged PR #5110:
URL: https://github.com/apache/gravitino/pull/5110





Re: [I] [Improvement] Add pre event to event listeners [gravitino]

2024-10-15 Thread via GitHub


jerryshao closed issue #5112: [Improvement] Add pre event to event listeners
URL: https://github.com/apache/gravitino/issues/5112





Re: [PR] [#4867] feat(core): Add storage support for columns in Gravitino [gravitino]

2024-10-15 Thread via GitHub


jerryshao merged PR #5078:
URL: https://github.com/apache/gravitino/pull/5078





Re: [I] [Subtask] Design and implement the column storage layout for columns. [gravitino]

2024-10-15 Thread via GitHub


jerryshao closed issue #4867: [Subtask] Design and implement the column storage 
layout for columns.
URL: https://github.com/apache/gravitino/issues/4867





Re: [I] [Bug report] Display users Java API on playground fails [gravitino]

2024-10-15 Thread via GitHub


justinmclean closed issue #5132: [Bug report] Display users Java API on 
playground fails
URL: https://github.com/apache/gravitino/issues/5132





Re: [I] [Improvement] Possble null pointer exception in DorisTableOperations.java [gravitino]

2024-10-15 Thread via GitHub


justinmclean closed issue #4605: [Improvement] Possble null pointer exception 
in DorisTableOperations.java
URL: https://github.com/apache/gravitino/issues/4605





Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802424734


##
clients/filesystem-hadoop3/src/main/java/org/apache/gravitino/filesystem/hadoop/GravitinoVirtualFileSystem.java:
##
@@ -125,6 +132,10 @@ public void initialize(URI name, Configuration configuration) throws IOException
 
 initializeClient(configuration);
 
+// Register the default local and HDFS FileSystemProvider
+String fileSystemProviders = configuration.get(FS_FILESYSTEM_PROVIDERS);
+fileSystemProvidersMap.putAll(FileSystemUtils.getFileSystemProviders(fileSystemProviders));

Review Comment:
   Some existing ITs already cover the current code; moreover, I verified 
this logic while developing the Python GVFS client.






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802363915


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/FileSystemUtils.java:
##
@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop.fs;
+
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.ServiceLoader;
+import java.util.Set;
+
+public class FileSystemUtils {
+
+  private FileSystemUtils() {}
+
+  public static Map<String, FileSystemProvider> getFileSystemProviders(String fileSystemProviders) {
+    Map<String, FileSystemProvider> resultMap = Maps.newHashMap();
+    ServiceLoader<FileSystemProvider> allFileSystemProviders =
+        ServiceLoader.load(FileSystemProvider.class);
+
+    Set<String> providersInUses =
+        fileSystemProviders != null
+            ? Arrays.stream(fileSystemProviders.split(","))
+                .map(String::trim)
+                .collect(java.util.stream.Collectors.toSet())
+            : Sets.newHashSet();
+    // Always add the built-in LocalFileSystemProvider and HDFSFileSystemProvider to the catalog.
+    providersInUses.add(LocalFileSystemProvider.class.getSimpleName());
+    providersInUses.add(HDFSFileSystemProvider.class.getSimpleName());
+
+    allFileSystemProviders.forEach(
+        fileSystemProvider -> {
+          if (providersInUses.contains(fileSystemProvider.getClass().getSimpleName())) {

Review Comment:
   So do we still need to configure the class names for fs-providers and 
default-fs-provider?
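For context, the filtering under discussion — ServiceLoader-discovered providers kept only when named in the comma-separated config value, with the two built-ins always added — can be sketched as below. This is a minimal, self-contained sketch with hypothetical names; it simulates discovery with a fixed map instead of `ServiceLoader`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of the provider-filtering logic from the quoted FileSystemUtils.
// DISCOVERED stands in for ServiceLoader.load(FileSystemProvider.class):
// it maps a provider's simple class name to the scheme it serves.
public class ProviderFilterSketch {

  static final Map<String, String> DISCOVERED =
      Map.of(
          "LocalFileSystemProvider", "file",
          "HDFSFileSystemProvider", "hdfs",
          "S3FileSystemProvider", "s3a");

  /** Keeps only the configured providers, always including the two built-ins. */
  public static Map<String, String> filter(String configured) {
    Set<String> inUse =
        configured != null
            ? Arrays.stream(configured.split(","))
                .map(String::trim)
                .collect(Collectors.toCollection(HashSet::new))
            : new HashSet<>();
    // Built-ins are unconditionally registered, mirroring the PR.
    inUse.add("LocalFileSystemProvider");
    inUse.add("HDFSFileSystemProvider");

    Map<String, String> result = new HashMap<>();
    DISCOVERED.forEach(
        (name, scheme) -> {
          if (inUse.contains(name)) {
            result.put(scheme, name); // keyed by scheme, as in the PR
          }
        });
    return result;
  }
}
```

With a null config only `file` and `hdfs` survive; naming `S3FileSystemProvider` additionally enables `s3a`.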






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802413137


##
docs/hadoop-catalog.md:
##
@@ -25,16 +25,19 @@ Hadoop 3. If there's any compatibility issue, please create an [issue](https://g
 
 Besides the [common catalog properties](./gravitino-server-config.md#gravitino-catalog-properties-configuration),
 the Hadoop catalog has the following properties:
 
-| Property Name                                      | Description                                                                                    | Default Value | Required                                                    | Since Version |
-|----------------------------------------------------|------------------------------------------------------------------------------------------------|---------------|-------------------------------------------------------------|---------------|
-| `location`                                         | The storage location managed by Hadoop catalog.                                                | (none)        | No                                                          | 0.5.0         |
-| `authentication.impersonation-enable`              | Whether to enable impersonation for the Hadoop catalog.                                        | `false`       | No                                                          | 0.5.1         |
-| `authentication.type`                              | The type of authentication for Hadoop catalog, currently we only support `kerberos`, `simple`. | `simple`      | No                                                          | 0.5.1         |
-| `authentication.kerberos.principal`                | The principal of the Kerberos authentication.                                                  | (none)        | required if the value of `authentication.type` is Kerberos. | 0.5.1         |
-| `authentication.kerberos.keytab-uri`               | The URI of the keytab for the Kerberos authentication.                                         | (none)        | required if the value of `authentication.type` is Kerberos. | 0.5.1         |
-| `authentication.kerberos.check-interval-sec`       | The check interval of Kerberos credential for Hadoop catalog.                                  | 60            | No                                                          | 0.5.1         |
-| `authentication.kerberos.keytab-fetch-timeout-sec` | The fetch timeout of retrieving Kerberos keytab from `authentication.kerberos.keytab-uri`.     | 60            | No                                                          | 0.5.1         |
-
+| Property Name                 | Description                                                                                                                                                                                                                                                    | Default Value   | Required | Since Version |
+|-------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------|----------|---------------|
+| `location`                    | The storage location managed by Hadoop catalog.                                                                                                                                                                                                                | (none)          | No       | 0.5.0         |
+| `filesystem-providers`        | The names (split by comma) of filesystem providers for the Hadoop catalog. Gravitino already supports the built-in `builtin-local` (local file) and `builtin-hdfs` (HDFS). If users want to support more file systems, they can implement `FileSystemProvider`. | (none)          | No       | 0.7.0         |
+| `default-filesystem-provider` | The name of the default filesystem provider of this Hadoop catalog, used if users do not specify the scheme in the URI. Default value is `builtin-local`.                                                                                                       | `builtin-local` | No       | 0.7.0         |
Review 

Re: [PR] [#5059] Add commands to create, delete and modify metalakes, catalogs and schema. [gravitino]

2024-10-15 Thread via GitHub


lisancao commented on code in PR #5117:
URL: https://github.com/apache/gravitino/pull/5117#discussion_r1802247505


##
clients/cli/errors.sh:
##
@@ -0,0 +1,63 @@
+#!/bin/bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# Some of these examples assume you have the Apache Gravitino playground 
running.
+
+alias gcli='java -jar 
clients/cli/build/libs/gravitino-cli-0.7.0-incubating-SNAPSHOT.jar'
+
+# No such command
+gcli unknown
+
+# unknown command and entiry

Review Comment:
   entity? 






Re: [PR] [#5086] core(feat): Add logic to support manipulating columns in TableOperationDispatcher [gravitino]

2024-10-15 Thread via GitHub


jerqi commented on code in PR #5127:
URL: https://github.com/apache/gravitino/pull/5127#discussion_r1802261172


##
core/src/main/java/org/apache/gravitino/catalog/EntityCombinedTable.java:
##
@@ -128,6 +128,14 @@ public boolean imported() {
     return imported;
   }
 
+  public Table tableFromCatalog() {

Review Comment:
   How about `tableFromExternalCatalog`?






[I] [Bug report] OB catalog UTs failed on Mac [gravitino]

2024-10-15 Thread via GitHub


mchades opened a new issue, #5152:
URL: https://github.com/apache/gravitino/issues/5152

   ### Version
   
   main branch
   
   ### Describe what's wrong
   
   OB catalog UTs failed on Mac
   
   ### Error message and/or stacktrace
   
   The OB Docker log can be found:
   
   > [ERROR] OBD-1007: (192.168.215.2) The value of the ulimit parameter "max 
user processes" must not be less than 12 (Current value: 56081), Please 
execute `echo -e "* soft nproc 12\n* hard nproc 12" >> 
/etc/security/limits.d/nproc.conf` as root in 192.168.215.2. if it dosen't 
work, please check whether UsePAM is yes in /etc/ssh/sshd_config.
   
   
   ### How to reproduce
   
   pull the main branch and run `./gradlew 
:catalogs:catalog-jdbc-oceanbase:test -PskipDockerTests=false` on MacOS
   
   ### Additional context
   
   Modifying the code below in `org.apache.gravitino.integration.test.container.BaseContainer` can fix it:
   ```java
   this.container =
       new GenericContainer<>(requireNonNull(image, "image is null"))
           .withCreateContainerCmdModifier(
               cmd ->
                   cmd.getHostConfig()
                       .withSysctls(
                           Collections.singletonMap("net.ipv4.ip_local_port_range", "2 4"))
                       .withUlimits(new Ulimit[] {new Ulimit("nproc", 12L, 12L)}));
   ```





Re: [PR] [#5059] Add commands to create, delete and modify metalakes, catalogs and schema. [gravitino]

2024-10-15 Thread via GitHub


lisancao commented on code in PR #5117:
URL: https://github.com/apache/gravitino/pull/5117#discussion_r1802247718


##
clients/cli/errors.sh:
##
@@ -0,0 +1,63 @@
+#!/bin/bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# Some of these examples assume you have the Apache Gravitino playground 
running.
+
+alias gcli='java -jar 
clients/cli/build/libs/gravitino-cli-0.7.0-incubating-SNAPSHOT.jar'
+
+# No such command
+gcli unknown
+
+# unknown command and entiry
+gcli unknown unknown
+
+# unknown command and entiry

Review Comment:
   entity*






Re: [I] [Bug report] OB catalog UTs failed on Mac [gravitino]

2024-10-15 Thread via GitHub


unfixa1 commented on issue #5152:
URL: https://github.com/apache/gravitino/issues/5152#issuecomment-2415639042

   This can be fixed by modifying the code in `org.apache.gravitino.integration.test.container.BaseContainer`:
   this.container =
       new GenericContainer<>(requireNonNull(image, "image is null"))
           .withCreateContainerCmdModifier(
               cmd ->
                   cmd.getHostConfig()
                       .withSysctls(
                           Collections.singletonMap("net.ipv4.ip_local_port_range", "2 4"))
                       .withUlimits(new Ulimit[] {new Ulimit("nproc", 12L, 12L)}));
   
   
   In addition, you should use [Servbay](servbay.com) to check your development 
environment, which is more useful than Docker on Mac.





Re: [PR] [SIP] support Iceberg event listener [gravitino]

2024-10-15 Thread via GitHub


FANNG1 closed pull request #5147: [SIP]  support Iceberg event listener
URL: https://github.com/apache/gravitino/pull/5147





Re: [I] [Bug report] OB catalog UTs failed on Mac [gravitino]

2024-10-15 Thread via GitHub


mchades commented on issue #5152:
URL: https://github.com/apache/gravitino/issues/5152#issuecomment-2415666256

   
   
   
   
   > In addition, you should use [Servbay](servbay.com) to check your 
development environment, which is more useful than Docker on Mac.
   
   Hi @unfixa1 , can you explain more? The link to Servbay seems broken





Re: [I] [FEATURE] support pre-hook when operating a resource [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on issue #4028:
URL: https://github.com/apache/gravitino/issues/4028#issuecomment-2415672611

   https://github.com/apache/gravitino/pull/5110 adds the basic pre-event 
framework, but Gravitino doesn't dispatch actual pre events to the event bus yet. 
Related issues are tracked in #5049.





Re: [PR] [4958] feat(iceberg): support event listener for Iceberg REST server [gravitino]

2024-10-15 Thread via GitHub


FANNG1 commented on PR #5002:
URL: https://github.com/apache/gravitino/pull/5002#issuecomment-2415676696

   @jerryshao  please help to review when you have time, thanks





Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802325120


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/FileSystemUtils.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop.fs;
+
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.ServiceLoader;
+import java.util.Set;
+
+public class FileSystemUtils {
+
+  private FileSystemUtils() {}
+
+  public static Map<String, FileSystemProvider> getFileSystemProviders(String fileSystemProviders) {
+    Map<String, FileSystemProvider> resultMap = Maps.newHashMap();
+    ServiceLoader<FileSystemProvider> allFileSystemProviders =
+        ServiceLoader.load(FileSystemProvider.class);

Review Comment:
   ok.






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802362158


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogPropertiesMetadata.java:
##
@@ -44,8 +62,24 @@ public class HadoopCatalogPropertiesMetadata extends BaseCatalogPropertiesMetadata
               false /* immutable */,
               null,
               false /* hidden */))
+          .put(
+              FILESYSTEM_PROVIDERS,
+              PropertyEntry.stringOptionalPropertyEntry(
+                  FILESYSTEM_PROVIDERS,
+                  "The file system provider names, separated by comma",
+                  false /* immutable */,
+                  null,
+                  false /* hidden */))
+          .put(
+              DEFAULT_FS_PROVIDER,
+              PropertyEntry.stringOptionalPropertyEntry(
+                  DEFAULT_FS_PROVIDER,
+                  "Default file system provider name",
+                  false /* immutable */,
+                  LocalFileSystemProvider.class.getSimpleName(),

Review Comment:
   Should it be "builtin-local" as a default value for this property?






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802263646


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/FileSystemProvider.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.catalog.hadoop.fs;
+
+import java.io.IOException;
+import java.util.Map;
+import javax.annotation.Nonnull;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+/**
+ * FileSystemProvider is an interface for providing FileSystem instances. It 
is used by the
+ * HadoopCatalog to create FileSystem instances for accessing Hadoop 
compatible file systems.
+ */
+public interface FileSystemProvider {
+
+  /**
+   * Get the FileSystem instance according to the configuration map and file path.
+   *
+   * <p>Compared to the {@link FileSystem#get(Configuration)} method, this method allows the
+   * provider to create a FileSystem instance with a specific configuration and do further
+   * initialization if needed.
+   *
+   * <p>For example: 1. We can check the endpoint value validity for S3AFileSystem then do further
+   * actions. 2. We can also change some default behavior of the FileSystem initialization process
+   * 3. More...
+   *
+   * @param config The configuration for the FileSystem instance.
+   * @param path The path to the file system.
+   * @return The FileSystem instance.
+   * @throws IOException If the FileSystem instance cannot be created.
+   */
+  FileSystem getFileSystem(@Nonnull Path path, @Nonnull Map<String, String> config)
+      throws IOException;
+
+  /**
+   * Scheme of this FileSystem provider. The value is 'file' for LocalFileSystem, 'hdfs' for HDFS,
+   * etc.
+   *
+   * @return The scheme of this FileSystem provider used.
+   */
+  String scheme();
+
+  /**
+   * Name of this FileSystem provider. The value is 'LocalFileSystemProvider' for LocalFileSystem,
+   * 'HDFSFileSystemProvider' for HDFS, etc.

Review Comment:
   The name should be simple, like "local" or "hdfs", and it should also be 
case-insensitive; that would be better.
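   A minimal sketch of the case-insensitive short-name lookup suggested here 
(the names and classes below are hypothetical illustrations, not the PR's API): 
both the registry keys and the query are normalized to lower case, so "HDFS" 
and "hdfs" resolve identically.

   ```java
   import java.util.Locale;
   import java.util.Map;
   import java.util.Optional;

   // Case-insensitive lookup for short provider names such as "local" or "hdfs".
   public class ProviderNameSketch {

     // short name (lower case) -> provider class simple name
     static final Map<String, String> PROVIDERS =
         Map.of(
             "local", "LocalFileSystemProvider",
             "hdfs", "HDFSFileSystemProvider");

     public static Optional<String> resolve(String name) {
       if (name == null) {
         return Optional.empty();
       }
       // Normalize the query to lower case before the map lookup.
       return Optional.ofNullable(PROVIDERS.get(name.toLowerCase(Locale.ROOT)));
     }
   }
   ```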



##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -742,4 +752,25 @@ private boolean checkSingleFile(Fileset fileset) {
           fileset.name());
     }
   }
+
+  FileSystem getFileSystem(Path path, Map<String, String> config) throws IOException {
+    if (path == null) {
+      throw new IllegalArgumentException("Path should not be null");
+    }
+
+    String scheme =
+        path.toUri().getScheme() != null
+            ? path.toUri().getScheme()
+            : defaultFileSystemProvider.scheme();
+
+    FileSystemProvider provider = fileSystemProvidersMap.get(scheme);
+    if (provider == null) {
+      throw new IllegalArgumentException(
+          String.format(
+              "Unsupported scheme: %s, path: %s, all supported scheme: %s and provider: %s",
+              scheme, path, fileSystemProvidersMap.keySet(), fileSystemProvidersMap.values()));

Review Comment:
   Do you make sure that "keySet" and "values" can be correctly printed out?



##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/FileSystemUtils.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop

Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r180238


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -742,4 +752,25 @@ private boolean checkSingleFile(Fileset fileset) {
           fileset.name());
     }
   }
+
+  FileSystem getFileSystem(Path path, Map<String, String> config) throws IOException {
+    if (path == null) {
+      throw new IllegalArgumentException("Path should not be null");
+    }
+
+    String scheme =
+        path.toUri().getScheme() != null
+            ? path.toUri().getScheme()
+            : defaultFileSystemProvider.scheme();
+
+    FileSystemProvider provider = fileSystemProvidersMap.get(scheme);
+    if (provider == null) {
+      throw new IllegalArgumentException(
+          String.format(
+              "Unsupported scheme: %s, path: %s, all supported scheme: %s and provider: %s",
+              scheme, path, fileSystemProvidersMap.keySet(), fileSystemProvidersMap.values()));

Review Comment:
   Yes, I have tested it. 






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802324737


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/FileSystemUtils.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop.fs;
+
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.ServiceLoader;
+import java.util.Set;
+
+public class FileSystemUtils {
+
+  private FileSystemUtils() {}
+
+  public static Map<String, FileSystemProvider> getFileSystemProviders(String fileSystemProviders) {
+    Map<String, FileSystemProvider> resultMap = Maps.newHashMap();
+    ServiceLoader<FileSystemProvider> allFileSystemProviders =
+        ServiceLoader.load(FileSystemProvider.class);
+
+    Set<String> providersInUses =
+        fileSystemProviders != null
+            ? Arrays.stream(fileSystemProviders.split(","))
+                .map(String::trim)
+                .collect(java.util.stream.Collectors.toSet())
+            : Sets.newHashSet();
+
+    // Always add the built-in LocalFileSystemProvider and HDFSFileSystemProvider to the catalog.
+    providersInUses.add(LocalFileSystemProvider.class.getSimpleName());
+    providersInUses.add(HDFSFileSystemProvider.class.getSimpleName());
+
+    allFileSystemProviders.forEach(
+        fileSystemProvider -> {
+          if (providersInUses.contains(fileSystemProvider.getClass().getSimpleName())) {
+            if (resultMap.containsKey(fileSystemProvider.scheme())) {
+              throw new UnsupportedOperationException(
+                  String.format(
+                      "File system provider with scheme '%s' already exists in the use provider list "
+                          + "Please make sure the file system provider scheme is unique.",
+                      fileSystemProvider.name()));
+            }
+
+            resultMap.put(fileSystemProvider.scheme(), fileSystemProvider);
+          }
+        });
+
+    return resultMap;
+  }
+
+  public static FileSystemProvider getFileSystemProviderByName(
+      Map<String, FileSystemProvider> fileSystemProviders, String defaultFileSystemProvider) {
+    return fileSystemProviders.entrySet().stream()
+        .filter(entry -> entry.getValue().name().equals(defaultFileSystemProvider))

Review Comment:
   Here `defaultFileSystemProvider` is the provider `name`; I want to get the 
provider by its name. I will rename the parameter to `providerName`.



##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/HDFSFileSystemProvider.java:
##
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop.fs;
+
+import static org.apache.gravitino.connector.BaseCatalog.CATALOG_BYPASS_PREFIX;
+
+import java.io.IOException;
+import java.util.Map;
+import javax.annotation.Nonnull;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hdfs.DistributedFileSystem;
+
+public class HDFSFileSystemProvider implements FileSystemProvider {
+
+  @Override
+  public FileSystem getFileSystem(@Nonnull Path path, @Nonnull Map<String, String> config)
+      throws IOException {
+    Configuration configuration = new Configuration();
+    config.forEach(
+        (k, v) -> {
+          configuration.set(k.replace(CATALOG_BYPASS_PREFIX, ""), v);
+

Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


yuqi1129 commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802325120


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/FileSystemUtils.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop.fs;
+
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.ServiceLoader;
+import java.util.Set;
+
+public class FileSystemUtils {
+
+  private FileSystemUtils() {}
+
+  public static Map<String, FileSystemProvider> getFileSystemProviders(String fileSystemProviders) {
+    Map<String, FileSystemProvider> resultMap = Maps.newHashMap();
+    ServiceLoader<FileSystemProvider> allFileSystemProviders =
+        ServiceLoader.load(FileSystemProvider.class);

Review Comment:
   ok.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gravitino.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802395715


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/FileSystemUtils.java:
##
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop.fs;
+
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.ServiceLoader;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class FileSystemUtils {
+
+  private FileSystemUtils() {}
+
+  public static Map<String, FileSystemProvider> getFileSystemProviders(String fileSystemProviders) {
+    Map<String, FileSystemProvider> resultMap = Maps.newHashMap();
+    ServiceLoader<FileSystemProvider> allFileSystemProviders =
+        ServiceLoader.load(FileSystemProvider.class);
+
+    Set<String> providersInUses =
+        fileSystemProviders != null
+            ? Arrays.stream(fileSystemProviders.split(","))
+                .map(String::trim)
+                .collect(java.util.stream.Collectors.toSet())
+            : Sets.newHashSet();
+
+    // Only get the file system providers that are in the use list.
+    allFileSystemProviders.forEach(
+        fileSystemProvider -> {
+          if (providersInUses.contains(fileSystemProvider.name())) {
+            if (resultMap.containsKey(fileSystemProvider.scheme())) {
+              throw new UnsupportedOperationException(
+                  String.format(
+                      "File system provider with scheme '%s' already exists in the use provider list "
+                          + "Please make sure the file system provider scheme is unique.",
+                      fileSystemProvider.name()));
+            }
+            resultMap.put(fileSystemProvider.scheme(), fileSystemProvider);
+          }
+        });
+
+    // Always add the built-in LocalFileSystemProvider and HDFSFileSystemProvider to the catalog.
+    FileSystemProvider builtInLocalFileSystemProvider = new LocalFileSystemProvider();
+    FileSystemProvider builtInHDFSFileSystemProvider = new HDFSFileSystemProvider();
+    resultMap.put(builtInLocalFileSystemProvider.scheme(), builtInLocalFileSystemProvider);
+    resultMap.put(builtInHDFSFileSystemProvider.scheme(), builtInHDFSFileSystemProvider);

Review Comment:
   They will be initialized by service loader, do you have to new these classes again?






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802397515


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/fs/FileSystemUtils.java:
##
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop.fs;
+
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.ServiceLoader;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class FileSystemUtils {
+
+  private FileSystemUtils() {}
+
+  public static Map<String, FileSystemProvider> getFileSystemProviders(String fileSystemProviders) {
+    Map<String, FileSystemProvider> resultMap = Maps.newHashMap();
+    ServiceLoader<FileSystemProvider> allFileSystemProviders =
+        ServiceLoader.load(FileSystemProvider.class);
+
+    Set<String> providersInUses =
+        fileSystemProviders != null
+            ? Arrays.stream(fileSystemProviders.split(","))
+                .map(String::trim)
+                .collect(java.util.stream.Collectors.toSet())
+            : Sets.newHashSet();

Review Comment:
   You can also add "local" and "hdfs" in this set to simplify the code below.
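   A hedged sketch of that simplification, under illustrative assumptions: the `Provider` interface and `loadAll()` list below are hypothetical stand-ins for `FileSystemProvider` and the `ServiceLoader` results. Seeding the in-use set with the built-in names up front means the loop picks the built-ins up from the loader's own instances, so the separate `new LocalFileSystemProvider()` / `new HDFSFileSystemProvider()` step after the loop goes away:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ProviderFilterDemo {

  // Hypothetical stand-in for the PR's FileSystemProvider interface.
  interface Provider { String scheme(); String name(); }

  static Provider provider(String scheme, String name) {
    return new Provider() {
      public String scheme() { return scheme; }
      public String name() { return name; }
    };
  }

  // Stand-in for ServiceLoader.load(FileSystemProvider.class); in the real
  // code these instances come from META-INF/services registrations.
  static List<Provider> loadAll() {
    return List.of(provider("file", "local"), provider("hdfs", "hdfs"), provider("s3", "s3"));
  }

  public static Map<String, Provider> getProviders(String csv) {
    Set<String> inUse = csv != null
        ? new HashSet<>(Arrays.asList(csv.split("\\s*,\\s*")))
        : new HashSet<>();
    // Seed the built-in names here instead of `new`-ing instances after the loop.
    inUse.add("local");
    inUse.add("hdfs");

    Map<String, Provider> result = new HashMap<>();
    for (Provider p : loadAll()) {
      // Keep only in-use providers; reject a second provider for the same scheme.
      if (inUse.contains(p.name()) && result.putIfAbsent(p.scheme(), p) != null) {
        throw new UnsupportedOperationException("Duplicate scheme: " + p.scheme());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(getProviders(null).size()); // 2: built-ins only
    System.out.println(getProviders("s3").size()); // 3: built-ins plus s3
  }
}
```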






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802404867


##
clients/filesystem-hadoop3/src/main/java/org/apache/gravitino/filesystem/hadoop/GravitinoVirtualFileSystem.java:
##
@@ -125,6 +132,10 @@ public void initialize(URI name, Configuration 
configuration) throws IOException
 
 initializeClient(configuration);
 
+    // Register the default local and HDFS FileSystemProvider
+    String fileSystemProviders = configuration.get(FS_FILESYSTEM_PROVIDERS);

Review Comment:
   Do you also need a default fileSystemProvider for gvfs?






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802412338


##
clients/filesystem-hadoop3/src/main/java/org/apache/gravitino/filesystem/hadoop/GravitinoVirtualFileSystemConfiguration.java:
##
@@ -32,6 +35,18 @@ class GravitinoVirtualFileSystemConfiguration {
   /** The configuration key for the Gravitino client auth type. */
   public static final String FS_GRAVITINO_CLIENT_AUTH_TYPE_KEY = 
"fs.gravitino.client.authType";
 
+  /**
+   * Full class name of file systems that implement {@link FileSystemProvider}` spilt by a comma.
+   *
+   * This configuration is used to register file system providers to the gvfs file system. For
+   * example:
+   *
+   * 
+   * XFileSystemProvider, FileSystemProvider

Review Comment:
   Is the description correct?






Re: [PR] [#5019] feat: (hadoop-catalog): Add a framework to support multi-storage in a pluggable manner for fileset catalog [gravitino]

2024-10-15 Thread via GitHub


jerryshao commented on code in PR #5020:
URL: https://github.com/apache/gravitino/pull/5020#discussion_r1802392106


##
clients/filesystem-hadoop3/src/main/java/org/apache/gravitino/filesystem/hadoop/GravitinoVirtualFileSystem.java:
##
@@ -125,6 +132,10 @@ public void initialize(URI name, Configuration 
configuration) throws IOException
 
 initializeClient(configuration);
 
+    // Register the default local and HDFS FileSystemProvider
+    String fileSystemProviders = configuration.get(FS_FILESYSTEM_PROVIDERS);
+    fileSystemProvidersMap.putAll(FileSystemUtils.getFileSystemProviders(fileSystemProviders));

Review Comment:
   gvfs will depend on the hadoop-catalog, can you please check the 
dependencies of gvfs to make sure the shading jar is correct?






Re: [PR] [#5059] Add commands to create, delete and modify metalakes, catalogs and schema. [gravitino]

2024-10-15 Thread via GitHub


justinmclean commented on code in PR #5117:
URL: https://github.com/apache/gravitino/pull/5117#discussion_r1802426110


##
clients/cli/errors.sh:
##
@@ -0,0 +1,63 @@
+#!/bin/bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# Some of these examples assume you have the Apache Gravitino playground running.
+
+alias gcli='java -jar clients/cli/build/libs/gravitino-cli-0.7.0-incubating-SNAPSHOT.jar'
+
+# No such command
+gcli unknown
+
+# unknown command and entiry
+gcli unknown unknown
+
+# unknown command and entiry

Review Comment:
   thanks fixed






Re: [PR] [#5086] core(feat): Add logic to support manipulating columns in TableOperationDispatcher [gravitino]

2024-10-15 Thread via GitHub


jerqi commented on PR #5127:
URL: https://github.com/apache/gravitino/pull/5127#issuecomment-2415831995

   Column order may be important, too. `A int, B int` is different from `B int, A int`.
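
   For illustration, one way to see the point (a toy `Column` record, not Gravitino's real column type): an order-insensitive comparison such as a set treats the two schemas as equal, while a list comparison keeps the distinction.

```java
import java.util.List;
import java.util.Set;

public class ColumnOrderDemo {

  // Toy column; Gravitino's real column type carries more fields.
  record Column(String name, String type) {}

  public static void main(String[] args) {
    List<Column> a = List.of(new Column("A", "int"), new Column("B", "int"));
    List<Column> b = List.of(new Column("B", "int"), new Column("A", "int"));

    // Same columns as an unordered set, but different schemas as ordered lists.
    System.out.println(a.equals(b));                         // false
    System.out.println(Set.copyOf(a).equals(Set.copyOf(b))); // true
  }
}
```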





[I] [Bug report] Doc: align the gravitino type in trino-connector doc with programming guides doc [gravitino]

2024-10-15 Thread via GitHub


danhuawang opened a new issue, #5150:
URL: https://github.com/apache/gravitino/issues/5150

   ### Version
   
   main branch
   
   ### Describe what's wrong
   
   **The gravitino type in trino-connector doc is the Java Type description**
   (screenshot: https://github.com/user-attachments/assets/980e7712-58a5-4b94-b24c-f08ddd5e2f71)
   
   **Gravitino table column type description in programming guides doc:**
   (screenshot: https://github.com/user-attachments/assets/200af420-6f8d-4160-96f2-46cf55121add)
   
   
   ### Error message and/or stacktrace
   
   N/A
   
   ### How to reproduce
   
   1. Trino connector doc: 
https://gravitino.apache.org/docs/0.6.0-incubating/trino-connector/supported-catalog#data-type-mapping-between-trino-and-apache-gravitino
   2. Apache Gravitino table column type: 
https://gravitino.apache.org/docs/0.6.0-incubating/manage-relational-metadata-using-gravitino#apache-gravitino-table-column-type
   
   ### Additional context
   
   _No response_





[PR] [IMPROVEMENT] fix: link to documentation website in README [gravitino]

2024-10-15 Thread via GitHub


SeanAverS opened a new pull request, #5149:
URL: https://github.com/apache/gravitino/pull/5149

   
   
   ### What changes were proposed in this pull request?
   
   - Link to documentation website 
   
   ### Why are the changes needed?
   
   - Link to documentation website instead of documentation directory for 
easier access 
   
   Fix: #5109 
   
   ### Does this PR introduce _any_ user-facing change?
   
   - No user-facing changes 
   
   ### How was this patch tested?
   
   - Link to website shows in README upon updating text 





[I] [Bug report] Tagging an entity with an unknown tag does nothing [gravitino]

2024-10-15 Thread via GitHub


justinmclean opened a new issue, #5148:
URL: https://github.com/apache/gravitino/issues/5148

   ### Version
   
   main branch
   
   ### Describe what's wrong
   
   When calling `associateTags` with an unknown tag, the tag will not be added 
to the entity and no error occurs. I would expect a NoSuchTag exception to be 
raised.
   
   ### Error message and/or stacktrace
   
   N/A
   
   ### How to reproduce
   
   Call associateTags with a tag name that has not been created in that 
metalake.
   
   ### Additional context
   
   Same happens when removing tags.

