Shuai Lu created SPARK-56044:
--------------------------------

             Summary: HistoryServerDiskManager does not delete app store on 
release when app is not in active map
                 Key: SPARK-56044
                 URL: https://issues.apache.org/jira/browse/SPARK-56044
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 4.1.1, 3.5.7, 4.1.0, 3.1.1, 4.2.0
            Reporter: Shuai Lu


In {{HistoryServerDiskManager.release()}}, the store directory deletion is 
gated inside an {{oldSizeOpt.foreach}} block, which only executes when the 
application is present in the in-memory {{active}} map:

{code:scala}
val oldSizeOpt = active.synchronized {
  active.remove(appId -> attemptId)
}
oldSizeOpt.foreach { oldSize =>
  val path = appStorePath(appId, attemptId)
  updateUsage(-oldSize, committed = true)
  if (path.isDirectory()) {
    if (delete) {
      deleteStore(path)   // never reached if app is not in active map
    }
    ...
  }
}
{code}

The {{active}} map is in-memory only and is empty after a History Server 
restart. When log expiration triggers {{release(appId, attemptId, delete = 
true)}} for an app that was never reopened after a restart, {{oldSizeOpt}} is 
{{None}}, the block is skipped entirely, and the on-disk store directory (.ldb 
/ .rdb) is never deleted. Over time these orphaned store directories 
accumulate, consuming disk space indefinitely.

*Fix:*
Separate the {{updateUsage}} deduction (which correctly applies only to 
actively tracked apps) from the directory operation (which should apply 
whenever the directory exists on disk). When deleting an app that was not in 
the active map, derive its size directly from disk before deducting it from 
usage to keep accounting accurate.
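
The separation described above can be sketched roughly as follows. This is a simplified, self-contained model and not the actual Spark source: {{DiskManagerSketch}}, {{sizeOf}}, {{open}} and {{currentUsage}} are hypothetical names, usage is assumed to be initialised from what is already on disk, and re-measuring a kept store on release is also an assumption of the sketch.

{code:scala}
import java.io.File
import scala.collection.mutable

// Hypothetical, simplified stand-in for the disk manager; not the Spark class.
class DiskManagerSketch(root: File) {
  // In-memory only: empty again after every (simulated) restart.
  private val active = mutable.HashMap[(String, Option[String]), Long]()

  // Assumption: usage starts out as the size of the stores already on disk.
  private var usage: Long = sizeOf(root)

  def currentUsage: Long = usage

  def appStorePath(appId: String, attemptId: Option[String]): File = {
    val suffix = attemptId.map("_" + _).getOrElse("")
    new File(root, appId + suffix + ".ldb")
  }

  // Assumed helper: recursive size of a file or directory, in bytes.
  private def sizeOf(f: File): Long = {
    if (f.isDirectory) Option(f.listFiles()).map(_.map(sizeOf).sum).getOrElse(0L)
    else f.length()
  }

  // Assumed helper: recursive delete of a store directory.
  private def deleteStore(f: File): Unit = {
    if (f.isDirectory) Option(f.listFiles()).foreach(_.foreach(deleteStore))
    f.delete()
  }

  // Marks a store as open, which is what puts it into the active map.
  def open(appId: String, attemptId: Option[String]): Unit = active.synchronized {
    active(appId -> attemptId) = sizeOf(appStorePath(appId, attemptId))
  }

  def release(appId: String, attemptId: Option[String], delete: Boolean): Unit = {
    val oldSizeOpt = active.synchronized {
      active.remove(appId -> attemptId)
    }
    val path = appStorePath(appId, attemptId)

    // Usage deduction for tracked apps only.
    oldSizeOpt.foreach { oldSize => usage -= oldSize }

    // Directory handling no longer depends on the active map.
    if (path.isDirectory) {
      if (delete) {
        // Untracked app: derive its size from disk before deducting it.
        if (oldSizeOpt.isEmpty) usage -= sizeOf(path)
        deleteStore(path)
      } else if (oldSizeOpt.isDefined) {
        // Store is kept: re-add its current on-disk size.
        usage += sizeOf(path)
      }
    }
  }
}
{code}

With this shape, calling {{release(appId, attemptId, delete = true)}} on a freshly constructed manager (the post-restart case) removes the directory and lowers the usage figure by the directory's on-disk size. The sketch does not synchronise access to the usage counter; a real change would need to go through the existing {{updateUsage}} path.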

*Steps to Reproduce:*
# Start the History Server with a local disk store configured ({{spark.history.store.path}}) and a non-trivial {{spark.history.store.maxDiskUsage}} setting.
# Load several applications (their .ldb/.rdb stores are created on disk).
# Close the application UIs (release without delete -- stores remain on disk).
# Restart the History Server (active map is now empty).
# Wait for or trigger log expiration cleanup.
# Observe that the .ldb/.rdb store directories are NOT deleted even though 
{{release(delete = true)}} is called.

*Expected:* Store directories are deleted when {{release(delete=true)}} is 
called.
*Actual:* Store directories are silently left on disk when the app is not in 
the active map.



