----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26263/ -----------------------------------------------------------
Review request for cloudstack, edison su and Min Chen. Summary (updated) ----------------- CLOUDSTACK-7539: [S3] Parallel deployment makes reference count of a cache in nfs secondary staging store negative(-1) Bugs: CLOUDSTACK-7539 https://issues.apache.org/jira/browse/CLOUDSTACK-7539 Repository: cloudstack-git Description (updated) ------- Root causes of CLOUDSTACK-7539 are below. a. createCacheObject() method lacks mutual exclusion. After calling this method, threads check cache existence but the procedure is not atomic. So, other threads may call this method and decide to create a cache in spite of cache copy being in progress. b. createCacheObject() method ignores states except READY when it checks cache existence. Then, it is considered cache doesn't exist in spite of cache copy being in progress. c. When multiple cache entries about a same image are in a database, threads may get a wrong cache entry because of incorrect search condition that randomly chooses one entry from entries that have same template id. As a result of this retrieval, reference counter of the cache probably go negative. (I'm not sure about this behavior but I suspect CloudStack behaves like this.) These behaviors make cache state incorrect and inconsistent. For fixing this bug, I modified cache creation behavior when no cache is on NFS Secondary Staging Store(SSS). As sequential deployment functions so far, I think parallel deployment works well when it is managed like sequential deployment. Fixed points are below. These are for cause a. and b., not including fix for cause c. 1. Protect critical section between cache search and allocation by lock. When management server threads reach createCacheObject() method, threads get lock at first and check cache existence. In case that no cache is on NFS SSS, one thread sends a request for creating cache. In other case, threads wait for completion of creating cache not to duplicate same image. 2. Add several condition check for cache state. If a thread is creating a cache of an image, other threads mustn't send no additional request for the same image. Therefore, threads have to consider whether a copy request should be issued by those state check. Diffs (updated) ----- engine/storage/cache/src/org/apache/cloudstack/storage/cache/manager/StorageCacheManagerImpl.java 7c74729 Diff: https://reviews.apache.org/r/26263/diff/ Testing (updated) ------- I tested deployment of two instances at almost the same time. Results are below. - Succeed in deployment of two instances. - No database cache entries are duplicated. - No duplicated images are on NFS SSS. - Reference counter never goes to -1. Thanks, Hiroki Ohashi