vinothchandar commented on a change in pull request #4955:
URL: https://github.com/apache/hudi/pull/4955#discussion_r831563020
##########
File path: .github/workflows/bot.yml
##########
@@ -16,18 +16,36 @@ jobs:
strategy:
matrix:
include:
- - scala: "scala-2.11"
- spark: "spark2"
- - scala: "scala-2.11"
- spark: "spark2,spark-shade-unbundle-avro"
- - scala: "scala-2.12"
- spark: "spark3.1.x"
- - scala: "scala-2.12"
- spark: "spark3.1.x,spark-shade-unbundle-avro"
- - scala: "scala-2.12"
- spark: "spark3"
- - scala: "scala-2.12"
- spark: "spark3,spark-shade-unbundle-avro"
+ - scalaProfile: "scala-2.11"
+ sparkProfile: "spark2"
+ sparkVersion: "2.4.4"
+
+ # Spark 3.1.x
+ - scalaProfile: "scala-2.12"
+ sparkProfile: "spark3.1.x"
+ sparkVersion: "3.1.0"
+
+ - scalaProfile: "scala-2.12"
+ sparkProfile: "spark3.1.x"
+ sparkVersion: "3.1.1"
+
+ - scalaProfile: "scala-2.12"
+ sparkProfile: "spark3.1.x"
+ sparkVersion: "3.1.2"
+
+ - scalaProfile: "scala-2.12"
+ sparkProfile: "spark3.1.x"
+ sparkVersion: "3.1.3"
+
+ # Spark 3.2.x
+ - scalaProfile: "scala-2.12"
+ sparkProfile: "spark3"
+ sparkVersion: "3.2.0"
+
+ - scalaProfile: "scala-2.12"
Review comment:
Would this be okay with gh action minutes? @xushiyan
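Regarding CI cost: each matrix entry parameterizes one Maven build. A minimal sketch (Python, purely illustrative; the `-D<profile>` flag shape follows the README's build examples, and the `spark.version` property name is an assumption, not taken from bot.yml) of how the new matrix fields could expand into build commands:

```python
# Illustrative only: expand workflow matrix entries into the Maven commands
# a CI job might run. The -Dspark.version property name is an assumption.
matrix = [
    {"scalaProfile": "scala-2.11", "sparkProfile": "spark2",     "sparkVersion": "2.4.4"},
    {"scalaProfile": "scala-2.12", "sparkProfile": "spark3.1.x", "sparkVersion": "3.1.2"},
    {"scalaProfile": "scala-2.12", "sparkProfile": "spark3",     "sparkVersion": "3.2.0"},
]

def mvn_command(entry):
    # Profiles activated via -D<profile>, matching the README's examples.
    return ("mvn clean package -DskipTests"
            f" -D{entry['scalaProfile']}"
            f" -D{entry['sparkProfile']}"
            f" -Dspark.version={entry['sparkVersion']}")

for entry in matrix:
    print(mvn_command(entry))
```

Each added Spark patch version multiplies the number of full builds, which is the action-minutes concern raised above.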
##########
File path: hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/hudi/SparkAdapter.scala
##########
@@ -54,6 +59,11 @@ trait SparkAdapter extends Serializable {
*/
def createAvroDeserializer(rootAvroType: Schema, rootCatalystType: DataType): HoodieAvroDeserializer
+ /**
+ * TODO
Review comment:
docs?
##########
File path: README.md
##########
@@ -90,21 +90,14 @@ mvn clean package -DskipTests -Dspark3
mvn clean package -DskipTests -Dspark3.1.x
```
-### Build without spark-avro module
+### What about "spark-avro" module?
-The default hudi-jar bundles spark-avro module. To build without spark-avro module, build using `spark-shade-unbundle-avro` profile
+Previously, Hudi bundles packaged (and shaded) the "spark-avro" module internally. However, after it
+broke between Spark patch versions on multiple occasions (most recently between 3.2.0 and 3.2.1),
+and after substantial deliberation, we decided to drop this dependency and instead clone the
+structures we rely on, to better control
Review comment:
we can shorten this a bit and just have the README give the actual steps
here?
```
### What about "spark-avro" module?
From version 0.11 onwards, Hudi no longer requires `spark-avro` to be specified
using `--packages`
```
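To make the suggested wording concrete, the user-facing change could be sketched as follows (a hypothetical launcher helper; the jar name and the `spark-avro` Maven coordinate are placeholders, not verified against any release):

```python
# Illustrative sketch: how a launcher might assemble spark-shell arguments
# before and after Hudi 0.11. Jar names and coordinates are placeholders.
def spark_shell_args(hudi_version, bundle_jar, spark_avro_coord):
    args = ["spark-shell", "--jars", bundle_jar]
    major, minor = (int(x) for x in hudi_version.split(".")[:2])
    if (major, minor) < (0, 11):
        # Pre-0.11: spark-avro had to be supplied explicitly via --packages
        args += ["--packages", spark_avro_coord]
    # 0.11+: no --packages needed; the required structures are cloned internally
    return args
```

For example, `spark_shell_args("0.10.1", ...)` includes `--packages`, while `spark_shell_args("0.11.0", ...)` does not.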
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]