Thanks Ivan. We're not using SSH agents but Docker Cloud (the agents are 
provisioned on the fly as Docker containers).

I was indeed looking for a way to turn on some debugging on the agent side 
but I couldn't find anything. Also, the agent Docker container is removed 
once the job is finished, so it's even harder to get any information about 
what's going on.

What I wanted to know is whether what we're experiencing is normal 
Jenkins behavior or not. I'm asking because a lot of our jobs run 
fine every day but we still have several that are killed in mid-air 
every day. For example, if I take agent 6 (a6) from 
https://up1.xwikisas.com/#vI0VAypIpe_tD9LrQRTdMA I can see it was 
terminated on 2020-02-10 at:
* 4:44
* 5:06
* 5:24
* 7:45
* 10:06
* 10:24
* etc

Now I don't think we have that many job failures every day. It's more like 
1 or 2 per day. So I'm not sure what to think of it. 

I was trying to investigate why we see the following regularly (every day) 
in our CI job logs:

Cannot contact Jenkins SSH Slave a6-009448n7sqon4: 
java.lang.InterruptedException
Agent Jenkins SSH Slave a6-009448n7sqon4 was deleted; cancelling node body
Could not connect to Jenkins SSH Slave a6-009448n7sqon4 to send interrupt 
signal to process
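
To count how often this shows up across job logs, something like the 
following Python sketch could be used. It assumes the console log is 
available as plain text and that the "was deleted" message is worded exactly 
as in the excerpt above; the function name count_deleted_agents is made up 
for illustration.

```python
import re
from collections import Counter

# Pattern based on the "was deleted" line quoted above. Assumption: the
# wording is identical in every affected job log.
DELETED_RE = re.compile(r"Agent (.+?) was deleted; cancelling node body")


def count_deleted_agents(log_text):
    """Return a Counter mapping agent name -> number of 'was deleted' lines."""
    return Counter(m.group(1) for m in DELETED_RE.finditer(log_text))


# Sample input: the three log lines from the excerpt, unwrapped.
sample = (
    "Cannot contact Jenkins SSH Slave a6-009448n7sqon4: "
    "java.lang.InterruptedException\n"
    "Agent Jenkins SSH Slave a6-009448n7sqon4 was deleted; "
    "cancelling node body\n"
    "Could not connect to Jenkins SSH Slave a6-009448n7sqon4 to send "
    "interrupt signal to process\n"
)

for agent, count in count_deleted_agents(sample).items():
    print(agent, count)
```

Comparing these per-agent counts with the terminations seen in the master 
log would show how many of the disconnections actually killed a running job.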

And then I discovered what I've pasted at 
https://up1.xwikisas.com/#vI0VAypIpe_tD9LrQRTdMA by looking at the Jenkins 
master log file, and I went "wow, how come there are so many disconnections?".

Any idea is most welcome!

Thanks a lot
-Vincent


On Friday, 14 February 2020 at 19:50:27 UTC+1, Ivan Fernandez Calvo wrote:
>
> The ping thread and some monitoring tasks run every 4 minutes. I think the 
> disconnections happen before that process runs, but because there is no 
> activity on these agents they are not detected until the ping thread runs. 
> So I guess you have half-closed connections; I mean, the agent closes the 
> connection but the master does not receive the reset packet. If you are 
> using SSH agents, you can enable verbose mode on the sshd server to 
> monitor what the heck happens; see 
> https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/doc/TROUBLESHOOTING.md#common-info-needed-to-troubleshooting-a-bug
