Race condition in VolumeServiceImpl.createBaseImageAsync() causes NPE

Darren Shepherd Thu, 31 Oct 2013 11:40:48 -0700

The following code can throw an NPE when two threads race to stage the same
template:

        templatePoolRef = _tmpltPoolDao.acquireInLockTable(templatePoolRefId, storagePoolMaxWaitSeconds);

        if (templatePoolRef == null) {
            if (s_logger.isDebugEnabled()) {
                s_logger.info("Unable to acquire lock on VMTemplateStoragePool " + templatePoolRefId);
            }
            templatePoolRef = _tmpltPoolDao.findByPoolTemplate(dataStore.getId(), template.getId());
            if (templatePoolRef.getState() == ObjectInDataStoreStateMachine.State.Ready ) {
                s_logger.info("Unable to acquire lock on VMTemplateStoragePool " + templatePoolRefId + ", But Template " + template.getUniqueName() + " is already copied to primary storage, skip copying");
                createVolumeFromBaseImageAsync(volume, templateOnPrimaryStoreObj, dataStore, future);
                return;
            }
            throw new CloudRuntimeException("Unable to acquire lock on VMTemplateStoragePool: " + templatePoolRefId);
        }

If two threads try to stage the same template, thread one gets the lock and
thread two waits.  If thread one fails to stage the template, it deletes the
templatePoolRef from the database.  Thread two then gets the lock in op_lock,
but the internal findById does not find the templatePoolRef because it has
been deleted, so acquireInLockTable() returns null.  Technically thread two
holds the lock, but the templatePoolRef was not found.  The subsequent line
"templatePoolRef = _tmpltPoolDao.findByPoolTemplate(...)" also returns null
because the row no longer exists, and on the next line
templatePoolRef.getState() throws an NPE.
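
To make the failing path concrete, here is a minimal, self-contained sketch of
one possible guard: check the result of findByPoolTemplate() for null before
calling getState().  Everything in it is an assumption for illustration only:
TemplateRefGuard, InMemoryDao, Outcome and afterLockReturnedNull() are
hypothetical stand-ins, not CloudStack classes, and this is not necessarily
the fix the project would choose.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins for the types named in the post.
class TemplateRefGuard {
    enum State { Allocated, Ready }

    // Possible results once acquireInLockTable() has already returned null.
    enum Outcome { SKIP_COPY, LOCK_FAILED, REF_DELETED }

    // Minimal stand-in for a template/pool reference row.
    static final class TemplatePoolRef {
        private final State state;
        TemplatePoolRef(State state) { this.state = state; }
        State getState() { return state; }
    }

    // In-memory stand-in for the DAO; a lookup returns null when the row is gone.
    static final class InMemoryDao {
        private final Map<String, TemplatePoolRef> rows = new HashMap<>();

        private static String key(long poolId, long templateId) {
            return poolId + ":" + templateId;
        }
        void put(long poolId, long templateId, TemplatePoolRef ref) {
            rows.put(key(poolId, templateId), ref);
        }
        void remove(long poolId, long templateId) {
            rows.remove(key(poolId, templateId));
        }
        TemplatePoolRef findByPoolTemplate(long poolId, long templateId) {
            return rows.get(key(poolId, templateId));
        }
    }

    // Decides what to do after the lock call returned null.
    static Outcome afterLockReturnedNull(InMemoryDao dao, long poolId, long templateId) {
        TemplatePoolRef ref = dao.findByPoolTemplate(poolId, templateId);
        if (ref == null) {
            // The other thread failed and deleted the row: no state to inspect.
            return Outcome.REF_DELETED;
        }
        if (ref.getState() == State.Ready) {
            // Template is already on primary storage, so copying can be skipped.
            return Outcome.SKIP_COPY;
        }
        return Outcome.LOCK_FAILED;
    }

    public static void main(String[] args) {
        InMemoryDao dao = new InMemoryDao();
        dao.put(1L, 2L, new TemplatePoolRef(State.Allocated));
        dao.remove(1L, 2L); // simulates thread one deleting the ref after a failure
        System.out.println(afterLockReturnedNull(dao, 1L, 2L));
    }
}
```

In the real method the REF_DELETED case would still need a decision (retry
staging the template, or fail with a clear CloudRuntimeException); the sketch
only shows that the null case has to be handled separately from the
"not Ready" case.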

Darren
