My one hesitation is that the change may surface other bugs. For
example, say a delete operation succeeds in deleting the image file
but fails in some subsequent code. The next time it runs, the delete
may fail because the image file is already gone (file not found), and
the volume gets stuck in an endless cleanup loop.  I think it's
ultimately good to surface those bugs (better than potentially
orphaning disks on storage), but it is something to watch for.
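
To make the retry behavior concrete, here is a rough sketch of the
transition table with the proposed change applied. It is a standalone
toy, not the real CloudStack state machine: the class
VolumeFsmSketch, the helpers addTransition/next/collectOnce, and the
deleteSucceeds flag are all made up for illustration. Only the state
and event names come from the snippet quoted below.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for the volume state machine; not CloudStack code.
final class VolumeFsmSketch {
    enum State { Destroy, Expunging, Expunged }
    enum Event { ExpungingRequested, OperationSucceeded, OperationFailed }

    // Transition table: current state -> (event -> next state).
    private static final Map<State, Map<Event, State>> TRANSITIONS =
            new EnumMap<State, Map<Event, State>>(State.class);

    private static void addTransition(State from, Event event, State to) {
        TRANSITIONS.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class))
                   .put(event, to);
    }

    static {
        addTransition(State.Destroy, Event.ExpungingRequested, State.Expunging);
        addTransition(State.Expunging, Event.ExpungingRequested, State.Expunging);
        addTransition(State.Expunging, Event.OperationSucceeded, State.Expunged);
        // Proposed change: a failed expunge goes back to Destroy instead of
        // staying in Expunging, so the next collector pass sees it again.
        addTransition(State.Expunging, Event.OperationFailed, State.Destroy);
    }

    static State next(State current, Event event) {
        Map<Event, State> byEvent = TRANSITIONS.get(current);
        State to = (byEvent == null) ? null : byEvent.get(event);
        if (to == null) {
            throw new IllegalStateException("No transition from " + current + " on " + event);
        }
        return to;
    }

    // One simulated garbage-collector pass over a single volume.
    // Assumption: the collector only acts on volumes in Destroy.
    static State collectOnce(State state, boolean deleteSucceeds) {
        if (state != State.Destroy) {
            return state;
        }
        State expunging = next(state, Event.ExpungingRequested);
        return next(expunging, deleteSucceeds ? Event.OperationSucceeded : Event.OperationFailed);
    }
}
```

With this table, collectOnce(State.Destroy, false) ends in Destroy, so
the volume is picked up again on the next pass. It also shows the
case I'm worried about: if the delete keeps failing (e.g. file not
found), the volume cycles Destroy -> Expunging -> Destroy on every
pass.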

On Thu, Jan 30, 2014 at 2:56 PM, Mike Tutkowski
<mike.tutkow...@solidfire.com> wrote:
> I agree, Marcus.
>
>
> On Thu, Jan 30, 2014 at 2:42 PM, Marcus <shadow...@gmail.com> wrote:
>
>> I think there's a hole in the volume lifecycle.  I've been noticing
>> volumes lingering that should have been cleaned up, and it seems to be
>> a bug in the state machine for the volumes:
>>
>>     s_fsm.addTransition(Destroy, Event.ExpungingRequested, Expunging);
>>     s_fsm.addTransition(Expunging, Event.ExpungingRequested, Expunging);
>>     s_fsm.addTransition(Expunging, Event.OperationSucceeded, Expunged);
>>     s_fsm.addTransition(Expunging, Event.OperationFailed, Expunging);
>>
>> If a volume is in the Destroy state, it goes to Expunging when the
>> delete operation is requested. If the delete fails, it remains in
>> Expunging. The storage garbage collector will never try to clean up
>> that volume again, since it only lists volumes in 'Destroy' and
>> attempts those. Since you can only get to Expunging from Destroy, it
>> makes sense to change that last line to revert the volume state back
>> to Destroy if the expunge operation fails, so that it will be retried
>> next time.
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkow...@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the
> cloud<http://solidfire.com/solution/overview/?video=play>
> *(tm)*
