On 11/28/2017 9:13 AM, Gustavo Randich wrote:
(running Mitaka)

When doing a block live migration, if the image / backing file is not present on the destination host, pre-live migration sometimes fails after 60 seconds, as shown below. Retrying the migration to the same destination host succeeds.

It seems that an rpc_response_timeout of 60 seconds is not enough for this scenario, in which fetching the image takes about 90 seconds. We would rather not increase rpc_response_timeout to, say, 120 seconds just for this case, because for other kinds of errors we prefer to fail fast.

Given that migrations are usually long, shouldn't this operation be under the scope of a configurable timeout such as live_migration_progress_timeout or live_migration_completion_timeout which overrides the default rpc timeout?

I think we've talked about adding a config option or somehow doing rpc timeouts differently for operations that we know are prone to timeouts, so I don't think people would be against a config option for this. I know there is at least one place in nova where we specify an rpc response timeout which is not the default.
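
To make the idea concrete, here is a rough sketch of what a per-operation timeout could look like. It is only an illustration, not how nova does it: the SLOW_CALL_TIMEOUTS table, the 120 second value and the timeout_for() helper are all made up for this example. The 60 second default is the rpc_response_timeout value from the report above.

```python
# Default reply timeout in seconds, matching the rpc_response_timeout
# value mentioned in this thread.
DEFAULT_RPC_TIMEOUT = 60

# Hypothetical per-method overrides for calls known to be slow.
# Neither this table nor the 120 second value is a real nova option.
SLOW_CALL_TIMEOUTS = {"pre_live_migration": 120}


def timeout_for(method, default=DEFAULT_RPC_TIMEOUT, overrides=None):
    """Return the reply timeout (in seconds) to use for an RPC method.

    An override never lowers the timeout below the default, so raising
    rpc_response_timeout globally still applies to the slow calls.
    """
    if overrides is None:
        overrides = SLOW_CALL_TIMEOUTS
    return max(default, overrides.get(method, default))


if __name__ == "__main__":
    for name in ("pre_live_migration", "some_fast_call"):
        print(name, timeout_for(name))
```

With oslo.messaging, the chosen value would then be passed per call via RPCClient.prepare(timeout=...), so every other call keeps the short default and still fails fast.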

--

Thanks,

Matt

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
