Hi all,

Back in late October, Vasyl added support to devstack to auto-detect and, when
possible, use kvm to power Ironic gate jobs
(0036d83b330d98e64d656b156001dd2209ab1903). This has reduced job run time when
it works, but it has also caused failures. How many is hard to quantify, as the
log messages that show the error don’t appear to be indexed by Elasticsearch.
The failures are seen often enough that the issue has become a permanent staple
on our gate whiteboard, and they don’t appear to be decreasing in quantity.

I pushed up a patch, https://review.openstack.org/#/c/421581, which keeps the 
auto detection behavior, but defaults devstack to use qemu emulation instead of 
kvm.

I have two questions:
1) Is there any way I’m not aware of to quantify the number of failures this is
causing? The key log message, "KVM: entry failed, hardware error 0x0",
shows up in logs/libvirt/qemu/node-*.txt.gz.
2) Are these failures avoidable or visible in any way?
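
As a rough way to approach question 1 without Elasticsearch, here is a minimal
sketch that counts the key message in job logs. It assumes the logs have
already been downloaded into a local directory that preserves the
libvirt/qemu/node-*.txt.gz layout; the function name and the directory argument
are hypothetical, not part of devstack or any gate tooling.

```python
import glob
import gzip
import os

# The key log message quoted above.
KVM_ERROR = "KVM: entry failed, hardware error 0x0"


def count_kvm_failures(log_root):
    """Return {path: hit_count} for node logs under log_root containing KVM_ERROR.

    log_root is a hypothetical local directory holding downloaded job logs.
    """
    # "**" matches zero or more directory levels when recursive=True.
    pattern = os.path.join(log_root, "**", "libvirt", "qemu", "node-*.txt.gz")
    hits = {}
    for path in glob.glob(pattern, recursive=True):
        # Read the gzipped log as text; tolerate any undecodable bytes.
        with gzip.open(path, "rt", errors="replace") as fh:
            count = sum(1 for line in fh if KVM_ERROR in line)
        if count:
            hits[path] = count
    return hits
```

Calling count_kvm_failures() on the download directory returns only the node
logs that hit the error, so the number of affected jobs can be compared against
the total number of jobs downloaded.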

In my opinion, if we can’t fix these failures, we have to make a change to
avoid using nested KVM altogether. Lower reliability for our jobs is not worth
a small decrease in job run time.

Thanks,
Jay Faulkner
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
