Another related issue for backwards compatibility: in DataSource.scala, the code at
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L415-L416
will get triggered even when the class is a valid DataSourceV2 but is being
used in a DataSourceV1 code path.
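(To make the scenario concrete: by "used in a DataSourceV1 code path" I just
mean the connector class being resolved by name through the classic
reader/writer API, roughly the sketch below. The provider class name is made
up.)

  // Hypothetical connector class that implements the DataSourceV2 interfaces,
  // but is resolved by name through the classic API and so goes through the
  // V1 DataSource resolution described above.
  val df = spark.read
    .format("com.example.SomeV2Source")
    .option("path", "/tmp/input")
    .load()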
I think those are fair concerns. I was mostly just updating tests for RC2
and adding in "append" everywhere.
Code like
spark.sql(s"SELECT a, b from $ks.test1")
.write
.format("org.apache.spark.sql.cassandra")
.options(Map("table" -> "test_insert1", "keyspace" -> ks))
.save()
Now fails at ...
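The fix going into the tests is just pinning the mode explicitly, along the
lines of the sketch below (the same snippet with "append" added, so whatever
default the source would pick no longer matters):

  spark.sql(s"SELECT a, b from $ks.test1")
    .write
    .format("org.apache.spark.sql.cassandra")
    .options(Map("table" -> "test_insert1", "keyspace" -> ks))
    .mode("append")
    .save()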
The context on this is that it was confusing that the mode changed, which
introduced different behaviors for the same user code when moving from v1
to v2. Burak pointed this out and I agree that it's weird that if your
dependency changes from v1 to v2, your compiled Spark job starts appending
instead of failing.
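To spell the concern out with a sketch (provider name made up, df standing in
for whatever DataFrame the job writes):

  // No explicit mode is set, so the source's default applies.
  df.write
    .format("com.example.SomeProvider")
    .save()
  // Against a v1 source the default is ErrorIfExists, so this throws if the
  // data already exists; against a v2 source with an Append default, the same
  // compiled code silently adds the rows instead.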
Hey Russell,
Great catch on the documentation. It seems out of date. I honestly am
against different DataSources having different default SaveModes. Users
will have no clue whether a DataSource implementation is V1 or V2. It
seems weird that the default value can change for something that I have ...
While the ScalaDocs for DataFrameWriter say
/**
* Specifies the behavior when data or table already exists. Options include:
*
* `SaveMode.Overwrite`: overwrite the existing data.
* `SaveMode.Append`: append the data.
* `SaveMode.Ignore`: ignore the operation (i.e. no-op).
 * `SaveMode.ErrorIfExists`: default option, throw an exception at runtime.
 */
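For reference, either overload of DataFrameWriter.mode pins the behavior
explicitly so the default never comes into play; a minimal sketch:

  import org.apache.spark.sql.SaveMode

  val df = spark.range(10).toDF("id")

  // Enum form
  df.write.mode(SaveMode.ErrorIfExists).format("parquet").save("/tmp/out_enum")

  // String form ("append", "overwrite", "ignore", "error")
  df.write.mode("error").format("parquet").save("/tmp/out_string")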