Thanks for the heads-up and the summary, Yaniv.

> Is there a planned schedule for the removal of some of them?
In general, we are very reluctant when it comes to removals, even for the deprecated ones.

https://spark.apache.org/versioning-policy.html

Removals should be considered case by case instead of being planned in a batch. In other words, I'm not sure we want to remove those deprecated APIs in the next Spark releases. At the very least, I believe they will be there for the lifetime of Spark 4.x, i.e. for the next 5+ years.

Spark 1: 2014.05 (1.0.0) ~ 2016.11 (1.6.3)
Spark 2: 2016.07 (2.0.0) ~ 2021.05 (2.4.8)
Spark 3: 2020.06 (3.0.0) ~ 2026.04 (3.5.x)
Spark 4: 2025.02 (4.0.0) ~ 2031?

Maybe we want to revisit this topic when we start to discuss Apache Spark 5 someday.

Best,
Dongjoon.

On 2025/06/25 17:08:58 Yaniv Kunda wrote:
> Hi all,
> During local builds, I've noticed there are >800 warnings about deprecated
> classes/methods.
> Is there a planned schedule for the removal of some of them?
>
> At least since Java 9, the Java `@Deprecated` annotation has a
> `forRemoval` attribute, which is not used yet - but since Spark 4.x
> requires Java 17, maybe it's a good time to start using it.
> It would still be beneficial to define some sort of external removal
> schedule/policy, whether time-based or version-based, to encourage users
> to move away from the deprecated code.
> (P.S. the comparable Scala annotation doesn't have that attribute.)
>
> I've processed the build output (with a bit of manual filling-in) and
> created a table with most of the deprecation warnings, sorted by #
> occurrences:
>
> | Lang. | What | Where | since | # | Source |
> |-------|------|-------|-------|---|--------|
> | Scala | class StreamingContext | package streaming | 3.4.0 | 132 | |
> | Scala | JavaStreamingContext | org.apache.spark.streaming.api.java | 3.4.0 | 58 | |
> | Scala | method Once | class Trigger | 3.4.0 | 25 | |
> | Scala | class UserDefinedAggregateFunction | package expressions | 3.0.0 | 17 | |
> | Scala | object StreamingContext | package streaming | 3.4.0 | 15 | |
> | Scala | class ChiSqSelector | package feature | 3.1.1 | 14 | |
> | Scala | class JavaStreamingContext | package java | 3.4.0 | 14 | |
> | Scala | class SparkListenerExecutorBlacklisted | package scheduler | 3.1.0 | 13 | |
> | Scala | method createExternalTable | class SQLContext | 2.2.0 | 12 | |
> | Scala | method load | class SQLContext | 1.4.0 | 12 | |
> | Scala | class SparkListenerExecutorBlacklistedForStage | package scheduler | 3.1.0 | 11 | |
> | Scala | class SparkListenerExecutorUnblacklisted | package scheduler | 3.1.0 | 11 | |
> | Scala | method schema | trait Table | 3.4.0 | 11 | |
> | Scala | class SparkListenerExecutorUnblacklistedForStage | package scheduler | 3.1.0 | 10 | |
> | Scala | class SparkListenerNodeBlacklistedForStage | package scheduler | 3.1.0 | 10 | |
> | Scala | method jsonRDD | class SQLContext | 1.4.0 | 10 | |
> | Scala | value blacklistedInStages | class ExecutorSummary | 3.1.0 | 10 | |
> | Scala | class SparkListenerNodeBlacklisted | package scheduler | 3.1.0 | 9 | |
> | Scala | value isBlacklisted | class ExecutorSummary | 3.1.0 | 9 | |
> | Scala | method applySchema | class SQLContext | 1.3.0 | 8 | |
> | Scala | class SparkListenerNodeUnblacklisted | package scheduler | 3.1.0 | 7 | |
> | Scala | method toDegrees | object functions | 2.1.0 | 7 | |
> | Scala | method toRadians | object functions | 2.1.0 | 7 | |
> | Scala | method jsonFile | class SQLContext | 1.4.0 | 6 | |
> | Scala | method jdbc | class SQLContext | 1.4.0 | 6 | |
> | Scala | method registerTempTable | class Dataset | 2.0.0 | 6 | |
> | Scala | method udf | object functions | 3.0.0 | 6 | |
> | Scala | method register | class UDFRegistration | 3.0.0 | 6 | |
> | Scala | value isBlacklistedForStage | class ExecutorStageSummary | 3.1.0 | 5 | |
> | Scala | method approxCountDistinct | object functions | 2.1.0 | 5 | |
> | Scala | method explode | class Dataset | 2.0.0 | 4 | |
> | Scala | method explode | class Dataset | 3.5.0 | 4 | |
> | Java | createTable(Identifier,Column[],Transform[],Map) | TableCatalog | 4.1.0 | 4 | |
> | Java | createTable(Identifier,StructType,Transform[],Map) | TableCatalog | 3.4.0 | 4 | |
> | Java | SparkListenerExecutorBlacklisted | org.apache.spark.scheduler | 3.1.0 | 3 | |
> | Java | SparkListenerExecutorBlacklistedForStage | org.apache.spark.scheduler | 3.1.0 | 3 | |
> | Java | SparkListenerExecutorUnblacklisted | org.apache.spark.scheduler | 3.1.0 | 3 | |
> | Java | SparkListenerNodeBlacklisted | org.apache.spark.scheduler | 3.1.0 | 3 | |
> | Java | SparkListenerNodeBlacklistedForStage | org.apache.spark.scheduler | 3.1.0 | 3 | |
> | Java | SparkListenerNodeUnblacklisted | org.apache.spark.scheduler | 3.1.0 | 3 | |
> | Scala | method !== | class Column | 2.0.0 | 3 | |
> | Scala | method createExternalTable | class Catalog | 2.2.0 | 3 | |
> | Scala | method newTaskTempFile | trait FileCommitProtocol | 3.3.0 | 2 | |
> | Scala | DEPRECATED_CHILD_CONNECTION_TIMEOUT | SparkLauncher | 3.2.0 | 2 | |
> | Java | DEPRECATED_CHILD_CONNECTION_TIMEOUT | SparkLauncher | 3.2.0 | 2 | |
> | Scala | method newTaskTempFileAbsPath | trait FileCommitProtocol | 3.3.0 | 1 | |
> | Scala | value holdingLocks | class ThreadStackTrace | 4.0.0 | 1 | |
> | Java | AppStatusSource.BLACKLISTED_EXECUTORS | AppStatusSource | 3.1.0 | 1 | |
> | Java | AppStatusSource.UNBLACKLISTED_EXECUTORS | AppStatusSource | 3.1.0 | 1 | |
> | Scala | method classifyException | class JdbcDialect | 4.0.0 | 1 | |
> | Scala | string sql | message SqlCommand | | 5 | proto |
> | Scala | trait AggregationBuffer | class GenericUDAFEvaluator | | 16 | hive |
> | Scala | method initialize | class AbstractSerDe | | 1 | hive |
> | Scala | method poll(long) | trait Consumer | | 5 | kafka |
> | Scala | class DefaultPartitioner | package internals | | 3 | kafka |
> | Scala | method getAllStatistics | class FileSystem | | 3 | hadoop |
> | Java | ParquetFileReader(Path,ParquetReadOptions) | ParquetFileReader | | 3 | parquet |
> | Java | ParquetFileReader(Path,ParquetReadOptions,List) | ParquetFileReader | | 3 | parquet |
> | Java | ParquetFileReader(Configuration,List,List,boolean) | ParquetFileReader | | 3 | parquet |
> | Java | readFooter(Configuration,Path) | ParquetFileReader | | 3 | parquet |
> | Java | BytesInput toByteArray() | BytesInput | | 1 | parquet |
> | Java | AvroParquetWriter builder(Path) | AvroParquetWriter | | 1 | parquet |
> | Scala | method readAllFootersInParallel | class ParquetFileReader | | 1 | parquet |
> | Java | RandomStringUtils random(int) | RandomStringUtils | | 3 | c-lang3 |
> | Java | RandomStringUtils randomAlphabetic(int) | RandomStringUtils | | 3 | c-lang3 |
> | Scala | method randomAlphanumeric | class RandomStringUtils | | 1 | c-lang3 |
>
> Most are Spark's own deprecations - per your input, I can create issues and
> PRs to handle specific areas.
> Regarding the 3rd-party deprecations (at the end of the table) - most
> (maybe all besides parquet) would be trivial to fix, for which I can create
> separate tasks.

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
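[Editor's note] For readers unfamiliar with the `forRemoval` attribute discussed above, here is a minimal, self-contained sketch of how it differs from a plain `@Deprecated` marker. The class and method names are hypothetical, not actual Spark APIs; only the annotation usage reflects the real Java 9+ feature.

```java
// Sketch of "terminal deprecation" via the Java 9+ @Deprecated attributes.
// javac emits a stronger "deprecated for removal" warning for forRemoval=true.
public class DeprecationExample {

    // Ordinary deprecation: a warning, but no removal implied.
    @Deprecated(since = "3.1.0")
    static String oldName() { return "old"; }

    // Terminal deprecation: signals users to migrate before a future release.
    @Deprecated(since = "3.1.0", forRemoval = true)
    static String oldNameForRemoval() { return "old"; }

    public static void main(String[] args) throws NoSuchMethodException {
        // The attributes are retained in the class file and visible via reflection,
        // so tooling can distinguish the two kinds of deprecation.
        Deprecated d = DeprecationExample.class
                .getDeclaredMethod("oldNameForRemoval")
                .getAnnotation(Deprecated.class);
        System.out.println("since=" + d.since() + " forRemoval=" + d.forRemoval());
    }
}
```

Running the class prints the annotation's attributes, showing that an external tool (or a build-output scan like the one above) could filter deprecations slated for removal.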