Doesn't the migration code 'clear' the debt? The proposal is not about continuing to support the config. I feel like people are not quite understanding the change and are objecting to something that doesn't exist. It's a shame, as this hardly seems worth discussing, and I don't know why it triggered this much discussion. We have kept deprecated methods without blinking, which is a much bigger burden in comparison. Could I ask you to review the actual change in question?

On Mon, Mar 10, 2025, 10:02 PM Yang Jie <yangji...@apache.org> wrote:

> -1
> Remove migration logic of incorrect `spark.databricks.*` configuration in
> Spark 4.0.0 because I think this configuration was initially introduced
> accidentally in Spark 3.5.4, lacking a clear design intent. Although the
> immediate maintenance cost of retaining this configuration currently seems
> limited, as subsequent versions iterate and user habits form, it may lead
> to the continuous accumulation of technical debt. When users come to view
> this configuration as one that can be relied on long-term, future removal
> may face greater resistance from users and could potentially become an
> entrenched and redundant configuration in the codebase. Therefore, promptly
> correcting this historically accidental configuration not only maintains
> the normativity of the Spark configuration system but also prevents
> unintended configurations from becoming de facto standards, thereby
> reducing long-term maintenance risks.
>
> Jie Yang
>
> On 2025/03/10 14:52:52 Dongjoon Hyun wrote:
> > -1 because there exists a feasible migration path for Apache Spark 3.5.4
> > via Apache Spark 3.5.5.
> >
> > It's obvious that this Databricks' mistake already causes a huge
> > communication cost in the Apache Spark community and is suggesting a
> > burden to enforce us to handle at least two more PRs at 4.0.0 and 4.1.0.
> >
> > Given that, I don't think
> > - This is an inevitable or
> > - This is 0 cost
> >
> > Dongjoon.
> >
> > On 2025/03/10 12:46:16 Jungtaek Lim wrote:
> > > Starting from my +1 (non-binding).
> > >
> > > In addition, I propose to retain migration logic till Spark 4.1.x and
> > > remove it in Spark 4.2.0.
> > >
> > > On Mon, Mar 10, 2025 at 9:44 PM Jungtaek Lim
> > > <kabhwan.opensou...@gmail.com> wrote:
> > >
> > > > Hi dev,
> > > >
> > > > Please vote to retain migration logic of incorrect
> > > > `spark.databricks.*` configuration in Spark 4.0.x.
> > > >
> > > > - DISCUSSION:
> > > > https://lists.apache.org/thread/xzk9729lsmo397crdtk14f74g8cyv4sr
> > > > ([DISCUSS] Handling spark.databricks.* config being exposed in
> > > > 3.5.4 in Spark 4.0.0+)
> > > >
> > > > Specifically, please review this post
> > > > https://lists.apache.org/thread/xtq1kjhsl4ohfon78z3wld2hmfm78t9k
> > > > which explains pros and cons about the proposal - proposal is about
> > > > "Option 1".
> > > >
> > > > Simply speaking, this vote is to allow streaming queries which had
> > > > been ever run in Spark 3.5.4 to be upgraded with Spark 4.0.x,
> > > > "without having to be upgraded with Spark 3.5.5+ in prior". If the
> > > > vote passes, we will help users to have a smooth upgrade from Spark
> > > > 3.5.4 to Spark 4.0.x, which would be almost 1 year.
> > > >
> > > > The (only) cons in this option is having to retain the incorrect
> > > > configuration name as "string" in the codebase a bit longer. The
> > > > code complexity of migration logic is arguably trivial. (link
> > > > <https://github.com/apache/spark/blob/4231d58245251a34ae80a38ea4bbf7d720caa439/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala#L174-L183>)
> > > >
> > > > This VOTE is for Spark 4.0.x, but if someone supports including
> > > > migration logic to be longer than Spark 4.0.x, please cast +1 here
> > > > and leave the desired last minor version of Spark to retain this
> > > > migration logic.
> > > >
> > > > The vote is open for the next 72 hours and passes if a majority +1
> > > > PMC votes are cast, with a minimum of 3 +1 votes.
> > > >
> > > > [ ] +1 Retain migration logic of incorrect `spark.databricks.*`
> > > > configuration in Spark 4.0.x
> > > > [ ] -1 Remove migration logic of incorrect `spark.databricks.*`
> > > > configuration in Spark 4.0.0 because...
> > > >
> > > > Thanks!
> > > > Jungtaek Lim (HeartSaVioR)
> > > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org