Hi,

I've been working on a check job that uses devstack-gate to run nova with the docker driver. While doing this I noticed that sometimes, during the nova boot of an instance, the node loses network connectivity (obviously a problem that needs to be worked on). What's interesting is Zuul's behavior when this occurs in the check queue: the job simply got restarted, and this kept happening until the job passed.
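
To make the concern concrete, here is a toy Python model of the behavior described above. It is only a sketch and is not Zuul's code: the names run_until_result and report are hypothetical, and it assumes that a build whose node drops off the network ends with no result (None) and is relaunched rather than reported as a failure.

```python
# Toy model only: hypothetical names, not taken from Zuul.
from typing import Callable, List, Optional


def run_until_result(launch: Callable[[], Optional[str]],
                     max_attempts: int = 5) -> List[Optional[str]]:
    """Relaunch a build whenever it ends without a result (None)."""
    history: List[Optional[str]] = []
    for _ in range(max_attempts):
        result = launch()
        history.append(result)
        if result is not None:
            # A real result (pass or fail) ends the retry loop.
            break
    return history


def report(history: List[Optional[str]]) -> str:
    """Only the last attempt is reported; earlier lost builds are not."""
    if history and history[-1] is not None:
        return history[-1]
    return "NO_RESULT"


# Two builds lose their node mid-run, the third one passes.
outcomes = iter([None, None, "SUCCESS"])
history = run_until_result(lambda: next(outcomes))
print(history, report(history))
```

In this model the two lost builds never show up in what gets reported, which is the part that worries me.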

A legitimately failed job:
https://jenkins05.openstack.org/job/check-nova-docker-dsvm-f20/2/
http://logs.openstack.org/14/91514/5/check/check-nova-docker-dsvm-f20/d5c1ebf/console.html

Retry (also failed):
https://jenkins07.openstack.org/job/check-nova-docker-dsvm-f20/3/
http://logs.openstack.org/14/91514/5/check/check-nova-docker-dsvm-f20/d5f26ed/console.html

Retried again (passed):
https://jenkins01.openstack.org/job/check-nova-docker-dsvm-f20/3/
http://logs.openstack.org/14/91514/5/check/check-nova-docker-dsvm-f20/2ebfa88/console.html

And success gets reported back to Gerrit:
https://review.openstack.org/#/c/91514/

    Patch Set 5: Verified+1
    check-nova-docker-dsvm-f20 SUCCESS in 17m 27s (non-voting)

Wouldn't this behavior allow commits that cause intermittent network problems to sneak past the gating infrastructure more easily?

I'm guessing that the retry is being triggered in zuul/launcher/gearman.py : onBuildCompleted(), because onDisconnect calls onBuildCompleted with no results param.

Any thoughts?

thanks,
Derek.

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev