vrozov opened a new pull request, #49276: URL: https://github.com/apache/spark/pull/49276
### What changes were proposed in this pull request?

The change improves warning logging in the `CacheManager` by:
1. Adding logical plan info to the existing warning messages.
2. Logging a warning message when an attempt is made to remove data from the cache, but the data is not present.

### Why are the changes needed?

The change helps to identify incorrect calls to `Dataset.persist()` and `Dataset.unpersist()` as in
```
Dataset<Row> dataset = ...
Dataset<Row> dataset1 = dataset.withColumn(...);
Dataset<Row> dataset2 = dataset1.withColumn(...);

dataset.persist(); // OK
dataset1.persist(); // OK
dataset.persist(); // currently logs warning without logical plan details
dataset.unpersist(); // OK
dataset.unpersist(); // no warning
dataset2.unpersist(); // no warning, the actual call should be on dataset1
```

### Does this PR introduce _any_ user-facing change?

Users may see warning messages like:
```
23.12.2024 19:15:03.840 WARN [pool-30-thread-1] org.apache.spark.sql.execution.CacheManager - An attempt was made to cache data even though the data had already been cached. Please un-cache data or clear cache first. Logical plan: Relation [i#0] JDBCRelation(test_table) [numPartitions=1]
```
and
```
23.12.2024 19:15:04.207 WARN [pool-30-thread-1] org.apache.spark.sql.execution.CacheManager - Data has not been previously cached or it was removed from the cache already. Logical plan: Project [i#0, i#0 AS year#6]
+- Relation [i#0] JDBCRelation(test_table) [numPartitions=1]
```

### How was this patch tested?

The change modifies warning log messages.

### Was this patch authored or co-authored using generative AI tooling?

No.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org