Hi, Craig.

Yes, I also thought of the configuration test suites.
For now, maybe the core team should extend the gate running time.
But for the Tempest tests I would suggest excluding some tests (the
longest ones) from the 'gate' group.
We need to deal with this ASAP, because the gate has been failing for four or five days.
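
To pick candidates for exclusion, the per-test timings that the int-tests print at the end of a run could be ranked. This is only a rough sketch: it assumes the timing lines look like the ones in the console output quoted below (with the Jenkins timestamps stripped), and the helper name `slowest_tests` and the regular expression are my own, not part of proboscis or Trove. The sample lines are copied from builds 2531 and 2521.

```python
import re

# Assumed line shape: "<seconds>  proboscis.case.MethodTest (<test name>)"
TIMING_LINE = re.compile(
    r"^\s*(\d+\.\d+)\s+proboscis\.case\.MethodTest\s+\((\w+)\)\s*$"
)


def slowest_tests(console_text, top=3):
    """Return the `top` slowest tests as (seconds, test_name), slowest first."""
    timings = []
    for line in console_text.splitlines():
        match = TIMING_LINE.match(line)
        if match:
            timings.append((float(match.group(1)), match.group(2)))
    return sorted(timings, reverse=True)[:top]


# Sample timing lines taken from the quoted console output.
sample = """
    335.41     proboscis.case.MethodTest (test_instance_created)
    194.05     proboscis.case.MethodTest (test_instance_returns_to_active_after_resize)
    186.89     proboscis.case.MethodTest (test_resize_down)
"""

for seconds, name in slowest_tests(sample):
    print(f"{seconds:8.2f}s  {name}")
```

Which of the slowest tests can safely leave the 'gate' group is still a judgment call for the core team.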

Best regards
Denis Makogon.

Sent from an iPad


On Mon, Feb 17, 2014 at 6:33 AM, Craig Vyvial <cp16...@gmail.com> wrote:

> Trovesters,
>
> One reason for the longer-running tests is that, for the configuration
> groups, I added the creation of a new instance. This tests that a new
> instance can be created with a configuration group applied. This might be
> making the run a little longer, but I am surprised that it's still taking
> over an hour to run through everything.
>
> -Craig Vyvial
>
>
> On Sun, Feb 16, 2014 at 12:25 AM, Mirantis <dmako...@mirantis.com> wrote:
>
>> Hello, Mathew.
>>
>> I'm seeing the same issues with the gate.
>> I also tried to find out why the gate job is failing. First I ran into an
>> issue related to a cinder installation failure in devstack, but then I
>> found the same problem you described. The best option is to increase the
>> job time limit.
>> Thanks for the research. I hope the gate will be fixed in the easiest way
>> and in the shortest possible time.
>>
>> Best regards
>> Denis Makogon.
>> Sent from an iPad
>>
>> On Feb 16, 2014, at 00:46, "Lowery, Mathew" <mlow...@ebay.com> wrote:
>>
>>  Hi all,
>>
>>  *Issue #1: Jobs that need more than one hour*
>>
>>  Of the last 30 Trove-Gate builds
>> <https://rdjenkins.dyndns.org/job/Trove-Gate/> (spanning three days),
>> 7 have failed due to a Jenkins job-level
>> timeout (not a proboscis timeout). These jobs had no failed tests when the
>> timeout occurred.
>>
>>  Not having access to the job config to see what the job looks like, I
>> used the console output to guess what was going on. It appears that a
>> Jenkins plugin named boot-hpcloud-vm
>> <https://github.com/mrhoades/boot-hpcloud-vm/blob/2272770b0ce54752eabb84229dc8939d79b2be50/models/boot_vm_concurrent.rb#L181>
>> is booting a VM and running the commands given, including redstack
>> int-tests. The console output states that it was supplied with an
>> ssh_shell_timeout="7200". This is passed down to another library called
>> net-ssh-simple
>> <https://github.com/busyloop/net-ssh-simple/blob/e3834f259a47606bfb06a487ca701fc20dbad8a5/lib/net/ssh/simple.rb#L632>.
>> net-ssh-simple has two timeouts: an idle timeout and an operation timeout.
>>
>>  In the latest boot-hpcloud-vm
>> <https://github.com/mrhoades/boot-hpcloud-vm/blob/2272770b0ce54752eabb84229dc8939d79b2be50/models/boot_vm_concurrent.rb#L182>,
>> ssh_shell_timeout is passed down to net-ssh-simple for both the idle
>> timeout and the operation timeout. But in older versions of
>> boot-hpcloud-vm
>> <https://github.com/mrhoades/boot-hpcloud-vm/blob/9260e957d6c54142c33dd9e9632b86e17fd5c02f/models/boot_vm_concurrent.rb#L141>,
>> ssh_shell_timeout is passed down to net-ssh-simple for only the idle
>> timeout, leaving a default operation timeout of 3600 seconds. This is why
>> I believe these jobs are failing after exactly one hour.
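
Inline note: the difference between the two plugin versions can be modelled with a small sketch. It only mirrors the description above; the function names and the `passes_operation_timeout` flag are illustrative assumptions, not the actual boot-hpcloud-vm or net-ssh-simple API.

```python
# Hypothetical model of the two timeouts described above.
# None of these names come from boot-hpcloud-vm or net-ssh-simple.

DEFAULT_OPERATION_TIMEOUT = 3600  # seconds; the default cited in the analysis


def effective_limits(ssh_shell_timeout, passes_operation_timeout):
    """Return (idle_timeout, operation_timeout) in seconds."""
    idle_timeout = ssh_shell_timeout
    if passes_operation_timeout:
        # Newer plugin: the same value is used for both timeouts.
        operation_timeout = ssh_shell_timeout
    else:
        # Older plugin: only the idle timeout is set, so the default applies.
        operation_timeout = DEFAULT_OPERATION_TIMEOUT
    return idle_timeout, operation_timeout


def job_is_killed(run_seconds, longest_silence_seconds, limits):
    """True if either the idle or the operation timeout would fire."""
    idle_timeout, operation_timeout = limits
    return (longest_silence_seconds > idle_timeout
            or run_seconds > operation_timeout)


old_plugin = effective_limits(7200, passes_operation_timeout=False)
new_plugin = effective_limits(7200, passes_operation_timeout=True)

# A 65-minute run that never goes silent for more than a minute:
print("old plugin kills job:", job_is_killed(3900, 60, old_plugin))
print("new plugin kills job:", job_is_killed(3900, 60, new_plugin))
```

Under this model a run of just over an hour is cut off with the older plugin even though ssh_shell_timeout is 7200, which matches the failures listed below.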
>>
>>  FYI: Here are the jobs that failed due to the Jenkins job-level timeout
>> (and had no test failures when the timeout occurred) along with their
>> associated patch sets:
>> https://rdjenkins.dyndns.org/job/Trove-Gate/2532/console (
>> http://review.openstack.org/73786)
>> https://rdjenkins.dyndns.org/job/Trove-Gate/2530/console (
>> http://review.openstack.org/73736)
>> https://rdjenkins.dyndns.org/job/Trove-Gate/2517/console (
>> http://review.openstack.org/63789)
>> https://rdjenkins.dyndns.org/job/Trove-Gate/2514/console (
>> https://review.openstack.org/50944)
>> https://rdjenkins.dyndns.org/job/Trove-Gate/2513/console (
>> https://review.openstack.org/50944)
>> https://rdjenkins.dyndns.org/job/Trove-Gate/2504/console (
>> https://review.openstack.org/73147)
>> https://rdjenkins.dyndns.org/job/Trove-Gate/2503/console (
>> https://review.openstack.org/73147)
>>
>>   *Suggested action items:*
>>
>>    - If it is acceptable to have jobs that run over one hour, then
>>    install the latest boot-hpcloud-vm plugin for Jenkins, which will
>>    make the operation timeout match the idle timeout.
>>
>>
>>  *Issue #2: The running time of all jobs is 1 hr 1 min*
>>
>>  While the Jenkins job-level timeout will end the job after one hour, it
>> also appears to keep every job running for a minimum of one hour.  To be
>> more precise, the timeout (or minimum running time) occurs on the part of
>> the Jenkins job that runs commands on the VM; the VM provision (which takes
>> about one minute) is excluded from this timeout, which is why the running
>> time of all jobs is around 1 hr 1 min
>> <https://rdjenkins.dyndns.org/job/Trove-Gate/buildTimeTrend>.
>> A sampling of console logs showing the time the int-tests completed and
>> when the timeout kicks in:
>>
>>  https://rdjenkins.dyndns.org/job/Trove-Gate/2531/console (00:01:03
>> wasted)
>>
>> *04:51:12* COMMAND_0: echo refs/changes/36/73736/2
>>
>> ...
>>
>> *05:50:10*     335.41     proboscis.case.MethodTest (test_instance_created)
>> *05:50:10*     194.05     proboscis.case.MethodTest (test_instance_returns_to_active_after_resize)
>> *05:51:13* **************************************
>> *05:51:13* ****** STDERR-BEGIN ******
>>
>>
>>  https://rdjenkins.dyndns.org/job/Trove-Gate/2521/console (00:06:44
>> wasted)
>>
>> *21:11:44* COMMAND_0: echo refs/changes/89/63789/13
>>
>> ...
>>
>> *22:05:00*     195.11     proboscis.case.MethodTest (test_instance_returns_to_active_after_resize)
>> *22:05:00*     186.89     proboscis.case.MethodTest (test_resize_down)
>> *22:11:44* **************************************
>> *22:11:44* ****** STDERR-BEGIN ******
>>
>>
>> https://rdjenkins.dyndns.org/job/Trove-Gate/2518/consoleFull (00:06:01
>> wasted)
>>
>> *17:46:59* COMMAND_0: echo refs/changes/02/64302/20
>>
>> ...
>>
>> *18:40:57*     210.03     proboscis.case.MethodTest (test_instance_returns_to_active_after_resize)
>> *18:40:57*     187.89     proboscis.case.MethodTest (test_resize_down)
>> *18:46:58* **************************************
>> *18:46:58* ****** STDERR-BEGIN ******
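
Inline note: the "wasted" figures above can be re-derived from the quoted timestamps with a few lines of Python. This is just an arithmetic check: the helper name `elapsed` is my own, and the three timestamps per build are copied from the console excerpts above (COMMAND_0 start, last test timing line, STDERR-BEGIN).

```python
from datetime import datetime, timedelta


def elapsed(start, end):
    """Time between two HH:MM:SS console timestamps, allowing a midnight wrap."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta


# build number -> (COMMAND_0 start, last test timing line, timeout fires)
samples = {
    2531: ("04:51:12", "05:50:10", "05:51:13"),
    2521: ("21:11:44", "22:05:00", "22:11:44"),
    2518: ("17:46:59", "18:40:57", "18:46:58"),
}

for build, (started, tests_done, timed_out) in samples.items():
    print(build,
          "wasted", elapsed(tests_done, timed_out),
          "total", elapsed(started, timed_out))
```

This prints wasted times of 0:01:03, 0:06:44 and 0:06:01, and totals of 1:00:01, 1:00:00 and 0:59:59, so in all three builds the command phase ends almost exactly one hour after COMMAND_0 starts.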
>>
>>
>> *Suggested action items:*
>>
>>
>>    - Given that the minimum running time is one hour, I assume the problem is
>>    in the net-ssh-simple library. Needs more investigation.
>>
>>
>>
>> *Issue #3: Jenkins console log line timestamps differ between full and truncated views*
>>
>>
>>  I assume this is due to JENKINS-17779
>> <https://issues.jenkins-ci.org/browse/JENKINS-17779>.
>>
>>  *Suggested action items:*
>>
>>    - Upgrade the timestamper plugin
>>    <https://wiki.jenkins-ci.org/display/JENKINS/Timestamper>.
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
