yangguoaws commented on code in PR #49276:
URL: https://github.com/apache/spark/pull/49276#discussion_r1944674922


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala:
##########
@@ -126,7 +126,9 @@ class CacheManager extends Logging with AdaptiveSparkPlanHelper {
     if (storageLevel == StorageLevel.NONE) {
       // Do nothing for StorageLevel.NONE since it will not actually cache any data.
     } else if (lookupCachedDataInternal(normalizedPlan).nonEmpty) {
-      logWarning("Asked to cache already cached data.")
+      logWarning(log"An attempt was made to cache data even though the data had already been " +

Review Comment:
   @gengliangwang This change makes the warning log more useful: developers can now identify which query plan was already persisted.
   
   - Before the change: the warning only said `Asked to cache already cached data.` Developers could not tell from the message which query plan was already cached. In a large project with many DataFrames, the warning was therefore of little use.
   
   - After the change: the warning shows which query plan was already cached, so developers can check their code and remove the unnecessary `cache`/`persist` call on that specific DataFrame.
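
   To illustrate the difference, here is a minimal sketch in plain Scala with no Spark dependency. The object name, the helper names, the wording of the "after" message, and the sample plan text are all assumptions made for illustration; they are not the code or the message in this PR.

   ```scala
   // Hypothetical, self-contained sketch: plain Scala, no Spark classes involved.
   object CacheWarningSketch {
     // Before: the message says nothing about which plan was already cached.
     def warningBefore(): String =
       "Asked to cache already cached data."

     // After (assumed wording): the message carries a description of the plan,
     // so a duplicate cache()/persist() call can be traced to one DataFrame.
     def warningAfter(planDescription: String): String =
       s"Asked to cache already cached data. Query plan:\n$planDescription"

     def main(args: Array[String]): Unit = {
       // Illustrative plan text only; real plan output will differ.
       val plan = "Project [id]\n+- Range (0, 10)"
       println(warningBefore())
       println(warningAfter(plan))
     }
   }
   ```

   With many DataFrames in one job, only the second form lets the reader match the warning to the `cache`/`persist` call that triggered it.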
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

