yangguoaws commented on code in PR #49276:
URL: https://github.com/apache/spark/pull/49276#discussion_r1944674922


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala:
##########
@@ -126,7 +126,9 @@ class CacheManager extends Logging with AdaptiveSparkPlanHelper {
     if (storageLevel == StorageLevel.NONE) {
       // Do nothing for StorageLevel.NONE since it will not actually cache any data.
     } else if (lookupCachedDataInternal(normalizedPlan).nonEmpty) {
-      logWarning("Asked to cache already cached data.")
+      logWarning(log"An attempt was made to cache data even though the data had already been " +

Review Comment:
   @gengliangwang This change makes the warning more useful: developers can now identify which query plan was already persisted.

   - Before the change: the warning only said `Asked to cache already cached data`, so developers could not tell which query plan was already cached. In a large project with many DataFrames, the warning was of little value to the user.
   - After the change: the warning shows which query plan was already cached, so developers can check their code and find the unnecessary cache/persist call for that specific DataFrame.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: reviews-h...@spark.apache.org