urosstan-db commented on code in PR #50886:
URL: https://github.com/apache/spark/pull/50886#discussion_r2092569820


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableSummary.java:
##########
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+

Review Comment:
   Yeah, adding new fields would be a breaking change for object creation, but not for `TableCatalog`.
   Adding too many fields can also make this API more expensive, not only because of the fetch time from remote catalogs/servers (which is less important), but because certain implementors may not be able to return all fields from their `listTableSummaries` implementations (e.g. the JDBC API).
   Which options do you suggest adding?
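   To make the compatibility point concrete, here is a minimal, hypothetical sketch (none of these names are the real Spark API): adding a field to a summary class changes its constructor and breaks every caller that instantiates it directly, while a `default` interface method keeps existing `TableCatalog` implementors source- and binary-compatible.

   ```java
   // Hypothetical illustration only; not the actual Spark classes.

   // A summary value class: adding a new field here changes the constructor
   // signature, which is a breaking change for anyone creating instances.
   final class TableSummaryExample {
     private final String name;
     private final String tableType;

     TableSummaryExample(String name, String tableType) {
       this.name = name;
       this.tableType = tableType;
     }

     String name() { return name; }
     String tableType() { return tableType; }
   }

   // The catalog interface is unaffected: a default method means old
   // implementors compile and run unchanged.
   interface TableCatalogExample {
     String[] listTables(String namespace);

     default TableSummaryExample[] listTableSummaries(String namespace) {
       String[] names = listTables(namespace);
       TableSummaryExample[] summaries = new TableSummaryExample[names.length];
       for (int i = 0; i < names.length; i++) {
         // By default, assume every listed entity is a standard table.
         summaries[i] = new TableSummaryExample(names[i], "TABLE");
       }
       return summaries;
     }
   }
   ```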



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCatalog.java:
##########
@@ -103,6 +105,23 @@ public interface TableCatalog extends CatalogPlugin {
    */
   Identifier[] listTables(String[] namespace) throws NoSuchNamespaceException;
 
+  /**
+   * List the table summaries in a namespace from the catalog.
+   * <p>
+   * This method should return all table entities from a catalog regardless of type (i.e. views
+   * should be listed as well).
+   *
+   * @param namespace a multi-part namespace
+   * @return an array of table summaries
+   * @throws NoSuchNamespaceException If the namespace does not exist (optional).
+   */
+  default TableSummary[] listTableSummaries(String[] namespace) throws NoSuchNamespaceException {
+    // By default, we assume that all tables have standard table type.

Review Comment:
   Good question. I think we can add a table pattern as well. If an implementor does not support an RPC with a table pattern specification, the implementor would need to do the filtering on the Spark side.
   @cloud-fan What do you think about that?
   We also have to define what the pattern is, and define some util methods that can be used for filtering on the Spark side when the pattern format is not compliant with the implementor's pattern format and the filtering can't be done on the remote system.
   
   Overall, I am not sure how often we would benefit from providing patterns; DSv2 interfaces usually take exact names, not patterns, but I think it is strictly better to add this if we define the util method I mentioned.
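   The Spark-side fallback filtering mentioned above could be sketched as a small util. Everything here is an assumption for illustration (the class name `TablePatternUtil` and SQL `LIKE`-style semantics where `%` matches any sequence and `_` matches a single character), not an existing Spark API.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.regex.Pattern;

   // Hypothetical util for filtering table names on the Spark side when the
   // remote catalog cannot apply the pattern itself.
   final class TablePatternUtil {
     private TablePatternUtil() {}

     // Translate a SQL LIKE-style pattern ('%' = any sequence, '_' = one char)
     // into a java.util.regex.Pattern, quoting all other characters literally.
     static Pattern compile(String likePattern) {
       StringBuilder regex = new StringBuilder();
       for (char c : likePattern.toCharArray()) {
         switch (c) {
           case '%': regex.append(".*"); break;
           case '_': regex.append('.'); break;
           default:  regex.append(Pattern.quote(String.valueOf(c)));
         }
       }
       return Pattern.compile(regex.toString());
     }

     // Keep only the names that fully match the pattern, preserving order.
     static List<String> filter(List<String> tableNames, String likePattern) {
       Pattern p = compile(likePattern);
       List<String> matched = new ArrayList<>();
       for (String name : tableNames) {
         if (p.matcher(name).matches()) {
           matched.add(name);
         }
       }
       return matched;
     }
   }
   ```

   An implementor whose remote system supports pattern pushdown could forward the pattern as-is; otherwise it would list everything and apply a filter like this locally.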



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

