"Daniel P. Berrange" <berra...@redhat.com> wrote:
> On Tue, Sep 20, 2011 at 03:24:41PM +0200, Juan Quintela wrote:
>> If we have one error while migrating, and then we issue a
>> "migrate_cancel" command, the guest hangs.  Fix it by flushing only when
>> migration is in MIG_STATE_ACTIVE.  In case of error or cancellation,
>> don't flush.
>> 
>> We had an infinite loop at buffered_close()
>> 
>>         while (!s->has_error && s->buffer_size) {
>>             buffered_flush(s);
>>             if (s->freeze_output)
>>                 s->wait_for_unfreeze(s);
>>         }
>> 
>> There were no errors, there were things to send, and the connection was
>> broken.  send() returns -EAGAIN, so we froze output, but then we
>> unfreeze_output and try again.
>
> I posted a couple of alternative approaches to fixing this
> hang problem
>
> http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg03248.html
>
> My second approach of checking the migration state in migrate_fd_put_buffer()
> seems like it would be worthwhile, even with your patch as an additional
> safety net against bad code.

We can add that there, but in my tests, s->write() was correctly
returning an error (or -EAGAIN).  The problem was that we were not
exiting the loop when we needed to.

I agree that we can have *both* tests.  I will add your patch to my
series.
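
To make the two checks concrete, here is a rough, self-contained sketch
of the idea.  It is not the actual patch: struct sketch_file, the
sketch_*() helpers, the link_up flag, the max_spins bound and the
MIG_STATE_ERROR / MIG_STATE_CANCELLED names are all made up for
illustration; only MIG_STATE_ACTIVE and the shape of the loop come from
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states; only MIG_STATE_ACTIVE is named in the thread. */
enum mig_state { MIG_STATE_ERROR, MIG_STATE_CANCELLED, MIG_STATE_ACTIVE };

/* Hypothetical stand-in for the buffered file state. */
struct sketch_file {
    enum mig_state state;
    bool has_error;
    size_t buffer_size;   /* bytes still queued */
    bool link_up;         /* false simulates a broken connection */
    int flush_attempts;   /* number of calls to sketch_flush() */
};

/*
 * Simulated flush.  With the link up everything is drained; with the
 * link down nothing is drained and no error is recorded, which mimics
 * send() returning -EAGAIN forever.
 */
void sketch_flush(struct sketch_file *s)
{
    assert(s != NULL);
    s->flush_attempts++;
    if (s->link_up) {
        s->buffer_size = 0;
    }
}

/*
 * Close path with the state check: flush only while the migration is
 * active.  max_spins only bounds the loop so that the sketch always
 * terminates.  Returns the number of bytes left unsent.
 */
size_t sketch_close(struct sketch_file *s, int max_spins)
{
    assert(s != NULL);
    while (s->state == MIG_STATE_ACTIVE && !s->has_error &&
           s->buffer_size > 0 && max_spins-- > 0) {
        sketch_flush(s);
    }
    return s->buffer_size;
}

/*
 * Second check, on the write side: refuse new data unless the
 * migration is active.  Returns the bytes queued, or -1 if refused.
 */
long sketch_put_buffer(struct sketch_file *s, size_t len)
{
    assert(s != NULL);
    if (s->state != MIG_STATE_ACTIVE) {
        return -1;
    }
    s->buffer_size += len;
    return (long)len;
}
```

With a cancelled migration and a dead link, sketch_close() returns
immediately without flushing, leaving the queued bytes unsent; without
the state check that case would retry until max_spins runs out.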

Thanks for the fast review.

Later, Juan.
