Problems still ongoing:
The 1device cluster worker nodes are currently down. I've notified CSIT in
Slack and am cc'ing them here. In the meantime I have a Gerrit change to remove
the 1device per-patch jobs so they don't delay voting on verify jobs.
Jenkins just crashed, so that will take a while to sort.
Vanessa and I are trying to empty the build queue at this point to get back
to zero so that Jenkins won't just crash again when it is reopened.
History:
Root cause:
a. We will have to wait on the CSIT folks for answers on the two 1device node
failures.
b. During the night the internal Docker registry stopped responding
(but still passed the socket health check, so it didn't fail over; see the
sketch after this list).
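For context on why failover never kicked in: a socket-level check only proves the
port accepts a TCP connection, which a wedged registry can still do. Below is a
minimal sketch (illustration only, not our actual Nomad health check; the host and
port are placeholders) contrasting that with an application-level probe of the
standard Docker registry /v2/ endpoint:

#!/usr/bin/env python3
"""Illustration only: why a socket-level check can keep passing while the
registry is effectively dead. Host/port are placeholders, not our setup."""

import socket
import urllib.error
import urllib.request

REGISTRY_HOST = "registry.example.internal"  # hypothetical
REGISTRY_PORT = 5000                          # hypothetical

def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """The kind of check that kept passing: only verifies the port
    accepts a TCP connection, not that the registry can serve requests."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def http_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Application-level check: a healthy Docker registry answers
    GET /v2/ with 200 (or 401 when auth is required)."""
    url = f"http://{host}:{port}/v2/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        # A 401 still proves the registry process is up and serving HTTP.
        return err.code == 401
    except OSError:
        return False

if __name__ == "__main__":
    print("tcp :", tcp_check(REGISTRY_HOST, REGISTRY_PORT))
    print("http:", http_check(REGISTRY_HOST, REGISTRY_PORT))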
Workflow:
1. I saw there was an issue while reading email around 6am Pacific this
morning.
2. Saw that the registry wasn't responding and attempted a restart.
3. Due to the Jenkins server queue hammering on the Nomad cluster, it
took a long while (roughly 40 minutes) to get that restart to go through.
4. Once the bottle was uncorked, the sixty pending jobs (including a
large number of checkstyle jobs) turned into 160.
5. Jenkins 'choked' and crashed.
6. 'We' started scrubbing the queue, which will cause a huge number of
rechecks, but at least Jenkins won't crash again.
****** current time ******
Future:
7. Will force in the ci-man patch removing the per-patch verify.
8. The Jenkins queue will re-open and I'll send another email.
9. I'm adding myself to the LF queue high-threshold alarm system so I
get paged/called when the queue gets above 90 (their current severe watermark).
10. I'll see if I can find a way to trawl Gerrit and manually recheck
what I can find (a rough sketch of one possible approach is below).
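On item 10, one possible approach (a sketch only, not necessarily what will
actually run): use the standard Gerrit SSH CLI to list open changes and post a
"recheck" comment, assuming the Jenkins Gerrit trigger treats that comment as a
re-verify request. The host, user, and query string below are placeholders and
would need adjusting:

#!/usr/bin/env python3
"""Rough sketch: find open changes that lost their verify vote in the
queue scrub and post a 'recheck' comment on each. Host, user, and the
query are assumptions/placeholders."""

import json
import subprocess

# Placeholder SSH target; Gerrit's SSH API listens on port 29418 by default.
GERRIT = ["ssh", "-p", "29418", "someuser@gerrit.example.org", "gerrit"]
# Assumed query for changes with no verify vote; adjust to taste.
QUERY = "status:open project:vpp label:Verified=0"

def gerrit_query(query: str):
    """Run `gerrit query` over SSH and yield one JSON record per change."""
    out = subprocess.run(
        GERRIT + ["query", "--format=JSON", "--current-patch-set", query],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        rec = json.loads(line)
        if rec.get("type") != "stats":  # the last record is a stats summary
            yield rec

def recheck(change: dict) -> None:
    """Comment 'recheck' on the current patch set; the Gerrit trigger is
    assumed to re-queue the verify job when it sees that comment."""
    rev = change["currentPatchSet"]["revision"]
    subprocess.run(GERRIT + ["review", "--message", "recheck", rev], check=True)

if __name__ == "__main__":
    for change in gerrit_query(QUERY):
        print("rechecking", change["number"], change.get("subject", ""))
        recheck(change)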
more as it rolls along