Darren, I just checked the code, you are right. In case of one thread throws 
exception in downloading template to primary, it will delete the entry in 
template_store_ref, causing the second thread failing with NPE. We need to fix 
this in 4.3. Please file a bug for this.

Thanks
-min

From: Darren Shepherd 
<darren.s.sheph...@gmail.com<mailto:darren.s.sheph...@gmail.com>>
Date: Thursday, October 31, 2013 11:39 AM
To: "dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org>" 
<dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org>>, Min Chen 
<min.c...@citrix.com<mailto:min.c...@citrix.com>>
Subject: race conditions in VolumeServiceImpl.createBaseImageAsync() creates NPE

The following code results in a NPE in bad situations

        templatePoolRef = _tmpltPoolDao.acquireInLockTable(templatePoolRefId, 
storagePoolMaxWaitSeconds);

        if (templatePoolRef == null) {
            if (s_logger.isDebugEnabled()) {
                s_logger.info<http://s_logger.info>("Unable to acquire lock on 
VMTemplateStoragePool " + templatePoolRefId);
            }
            templatePoolRef = 
_tmpltPoolDao.findByPoolTemplate(dataStore.getId(), template.getId());
            if (templatePoolRef.getState() == 
ObjectInDataStoreStateMachine.State.Ready ) {
                s_logger.info<http://s_logger.info>("Unable to acquire lock on 
VMTemplateStoragePool " + templatePoolRefId + ", But Template " + 
template.getUniqueName() + " is already copied to primary storage, skip 
copying");
                createVolumeFromBaseImageAsync(volume, 
templateOnPrimaryStoreObj, dataStore, future);
                return;
            }
            throw new CloudRuntimeException("Unable to acquire lock on 
VMTemplateStoragePool: " + templatePoolRefId);
        }

If two threads are trying to stage the same template thread one gets the lock, 
thread two will wait.  If thread one fails to stage the template it will delete 
the templatePoolRef from the database.  Thread two will now get the lock in 
op_lock, but the internal findById will not find a templatePoolRef because it 
has been deleted and return null from acquireInLockTable().  Technically thread 
two has the lock, but the ref templatePoolRef wasn't found.  The subsequent 
line "templatePoolRef = _tmpltPoolDao.findByPoolTemplate(...)" will return 
null, because it doesn't exist and then on the next line 
templatePoolRef.getState() will throw a NPE.

Darren

Reply via email to