As you can see, I've been working on Scala 2.13 support. The umbrella JIRA is https://issues.apache.org/jira/browse/SPARK-25075. I wanted to lay out status and strategy.
This will not be done for 3.0. At the least, there are a few key dependencies (Chill, Kafka) that aren't published for 2.13, and at least one change will require removing an API deprecated as of 3.0. Realistically: maybe Spark 3.1. I don't yet think it's pressing.

Making the change is difficult, as it's hard to understand the extent of the necessary changes until the whole thing minimally compiles for 2.13. I have gotten essentially that far in a local clone. The good news is I don't see any obvious hard blockers, but the changes add up to thousands of lines in 200+ files.

What do we need to do for 3.0? Any changes that entail breaking a public API, ideally. The biggest issue there is that extensive changes to the Scala collection hierarchy mean that the types of many public APIs that return a Seq, Map, TraversableOnce, etc. _will_ actually change in 2.13 (become immutable). See https://issues.apache.org/jira/browse/SPARK-27683 and https://issues.apache.org/jira/browse/SPARK-29292 as the main examples. In both cases, keeping the exact same public type would require much bigger changes. These are the kind of changes that all applications face when migrating to 2.13, though; 2.12 and 2.13 apps were never meant to be binary-compatible. So, in both cases we're not changing these, to avoid a lot of change and parallel source trees. I _think_ we're therefore done with any other must-do changes for 3.0.

What _can_ we do for 3.0? Small changes that don't affect the 2.12 build are OK, and that's what you see in pull requests going in at the moment. The big question is whether we want to do the large change for https://issues.apache.org/jira/browse/SPARK-29292 before 3.0. It will mean adding a ton of ".toSeq" and ".toMap" calls to make mutable collections immutable when passed to methods. In theory, it won't affect behavior. We'll have to see if it does in practice.
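To make the ".toSeq" point concrete, here is a minimal sketch of the kind of call-site change involved. The object and method names (ToSeqSketch, describe) are made up for illustration and are not Spark code. It assumes only the standard library: the unqualified Seq is scala.collection.Seq in 2.12 and scala.collection.immutable.Seq in 2.13.

```scala
import scala.collection.mutable.ArrayBuffer

object ToSeqSketch {
  // Hypothetical method standing in for any method declared with the default Seq.
  // 2.12: Seq means scala.collection.Seq; 2.13: scala.collection.immutable.Seq.
  def describe(columns: Seq[String]): String = columns.mkString(", ")

  def main(args: Array[String]): Unit = {
    val buffer = ArrayBuffer("a", "b")

    // describe(buffer)  // compiles on 2.12; type mismatch on 2.13,
    //                   // because ArrayBuffer is not an immutable.Seq

    // Adding .toSeq compiles on both 2.12 and 2.13.
    println(describe(buffer.toSeq))
  }
}
```

The same source then builds against either Scala version, which is why this approach avoids parallel source trees.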
The rest will have to wait until after 3.0, I believe, including even testing the 2.13 build, which will probably turn up some more issues.

Thoughts on approach?