xushiyan commented on a change in pull request #4720:
URL: https://github.com/apache/hudi/pull/4720#discussion_r820171075
##########
File path:
hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/adapter/Spark3Adapter.scala
##########
@@ -36,6 +36,9 @@ import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
import org.apache.spark.sql.hudi.SparkAdapter
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types.DataType
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.hudi.SparkAdapter
+import org.apache.spark.sql.internal.SQLConf
Review comment:
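Unused imports? `SparkAdapter` and `SQLConf` are already imported just above,
so the two added lines look like duplicates.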
##########
File path: hudi-spark-datasource/README.md
##########
@@ -36,3 +36,16 @@ file that supports spark sql on spark 2.x version.
has no class since hudi only supports spark 2.4.4 version, and it acts as the
placeholder when packaging hudi-spark-bundle module.
* hudi-spark3-common is the module that contains the code that would be reused
between spark3.x versions.
* hudi-spark-common is the module that contains the code that would be reused
between spark2.x and spark3.x versions.
+
+## Description of Time Travel
+* `HoodieSpark3_2ExtendedSqlAstBuilder` have comments in the spark3.2's code
fork from `org.apache.spark.sql.catalyst.parser.AstBuilder`, and additional
`withTimeTravel` method.
+* `SqlBase.g4` have comments in the code forked from spark3.2's parser, and
add SparkSQL Syntax `TIMESTAMP AS OF` and `VERSION AS OF`.
Review comment:
Can you also list which classes/files can be removed once we upgrade
to Spark 3.3?
##########
File path:
hudi-spark-datasource/hudi-spark3/src/main/scala/org/apache/spark/sql/adapter/Spark3_2Adapter.scala
##########
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.adapter
+
+import org.apache.avro.Schema
+import org.apache.hudi.Spark3RowSerDe
+import org.apache.hudi.client.utils.SparkRowSerDe
+import org.apache.hudi.spark3.internal.ReflectUtil
+import org.apache.spark.sql.avro.{HoodieAvroDeserializerTrait, HoodieAvroSerializer, HoodieAvroSerializerTrait, Spark3HoodieAvroDeserializer}
+import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation
+import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
+import org.apache.spark.sql.catalyst.expressions.{Expression, Like}
+import org.apache.spark.sql.catalyst.parser.ParserInterface
+import org.apache.spark.sql.catalyst.plans.JoinType
+import org.apache.spark.sql.catalyst.plans.logical.{InsertIntoStatement, Join, JoinHint, LogicalPlan}
+import org.apache.spark.sql.catalyst.{AliasIdentifier, TableIdentifier}
+import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
+import org.apache.spark.sql.execution.datasources.{FilePartition, PartitionedFile, Spark3ParsePartitionUtil, SparkParsePartitionUtil}
+import org.apache.spark.sql.hudi.SparkAdapter
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.parser.HoodieSpark3_2ExtendedSqlParser
+import org.apache.spark.sql.types.DataType
+import org.apache.spark.sql.{Row, SparkSession}
+
+/**
+ * The adapter for spark3.
Review comment:
```suggestion
* The adapter for spark3.2.
```
##########
File path:
hudi-spark-datasource/hudi-spark3/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/TimeTravelRelation.scala
##########
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.plans.logical
+
+import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression}
+
+case class TimeTravelRelation(
Review comment:
This looks identical to the one in `hudi-spark`. Can we deduplicate?
##########
File path: hudi-spark-datasource/hudi-spark3/pom.xml
##########
@@ -157,7 +175,7 @@
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
- <version>${spark3.version}</version>
+ <version>${spark3.2.version}</version>
Review comment:
Can you clarify why this change is needed? What if a user needs to build the
project with the Spark 3.1.x profile?
##########
File path: pom.xml
##########
@@ -120,6 +120,8 @@
<flink.version>1.14.3</flink.version>
<spark2.version>2.4.4</spark2.version>
<spark3.version>3.2.1</spark3.version>
+ <spark3.1.version>3.1.2</spark3.1.version>
+ <spark3.2.version>3.2.1</spark3.2.version>
Review comment:
Not sure why we need these new properties. `spark3.version` is always
the default and points to the latest supported Spark 3 release. We should build
the project with the `spark3.1.x` profile if we want `spark3.version` to point
to 3.1. Can you clarify?
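To illustrate the setup described in this comment, here is a minimal sketch of
a single `spark3.version` property that a build profile overrides. The profile
id, the activation property name, and the layout are assumptions for
illustration only, not a copy of the project's actual pom.xml; the version
numbers are the ones shown in the diff above.

```xml
<!-- Sketch only: profile id and activation property are illustrative assumptions. -->
<properties>
  <!-- Default: points to the latest supported Spark 3 release -->
  <spark3.version>3.2.1</spark3.version>
</properties>

<profiles>
  <profile>
    <id>spark3.1.x</id>
    <activation>
      <!-- Activated when the build is run with -Dspark3.1.x (assumed flag) -->
      <property>
        <name>spark3.1.x</name>
      </property>
    </activation>
    <properties>
      <!-- Overrides the default only while this profile is active -->
      <spark3.version>3.1.2</spark3.version>
    </properties>
  </profile>
</profiles>
```

With a layout like this, modules can keep referencing `${spark3.version}` and
the selected profile decides which Spark 3 line is built, so separate
`spark3.1.version` and `spark3.2.version` properties may not be needed.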