After looking at the log, I understand you are asking about the INFO messages 
reporting that a Docker agent has disconnected. IIRC those messages are normal: 
they only report the Docker agent status. If they bother you, you can change 
the log level of that Java package in the logging configuration so that 
messages of this type are omitted.
As for the other message, the InterruptedException, it looks like an issue, 
but there is not much information to troubleshoot it. You will have to monitor 
those errors and try to find something in common: always the same job, the 
same Docker image, the same resources, and so on. The most common cause is a 
resource problem; in those cases the container is killed because of an OOM 
error. You can check whether this is the case if you are able to run 
docker inspect on the container.
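
To illustrate the OOM check, here is a minimal Python sketch that reads the 
JSON printed by docker inspect and reports the State.OOMKilled and 
State.ExitCode fields. The function names and the sample JSON are made up for 
illustration, and it assumes the container still exists when you inspect it 
(so it has to run before the agent container is removed).

```python
import json
import subprocess


def oom_status(inspect_json: str) -> dict:
    """Extract OOM-related fields from the JSON printed by `docker inspect`."""
    # `docker inspect` prints a JSON array with one object per container.
    containers = json.loads(inspect_json)
    state = containers[0].get("State", {})
    return {
        "oom_killed": bool(state.get("OOMKilled", False)),
        "exit_code": state.get("ExitCode"),
    }


def inspect_container(name: str) -> dict:
    """Run `docker inspect <name>`; fails if the container no longer exists."""
    result = subprocess.run(
        ["docker", "inspect", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return oom_status(result.stdout)


if __name__ == "__main__":
    # Hypothetical sample, trimmed to the fields used above.
    sample = '[{"State": {"Status": "exited", "OOMKilled": true, "ExitCode": 137}}]'
    print(oom_status(sample))
```

If oom_killed is true, or the exit code is 137 (128 + SIGKILL), the container 
was most likely killed for exceeding its memory limit.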

>> On 14 Feb 2020, at 21:47, Vincent Massol <vmas...@gmail.com> wrote:
> 
> Thanks Ivan. We're not using SSH agents but Docker Cloud (the agents are 
> provisioned on the fly as docker containers).
> 
> I was indeed looking for how to turn on some debugging on the agent side but 
> I couldn't find anything. Also the agent docker container is removed once the 
> job is finished so it seems even harder to get some info about what's going 
> on.
> 
> What I wanted to know is whether what we're experiencing is a normal behavior 
> of Jenkins or not. I'm asking because a lot of our jobs are going fine every 
> day but we still have several ones that are killed in mid-air every day. For 
> example if I take agent 6 (a6) from 
> https://up1.xwikisas.com/#vI0VAypIpe_tD9LrQRTdMA I can see it's been 
> terminated on 2020-02-10 at:
> * 4:44
> * 5:06
> * 5:24
> * 7:45
> * 10:06
> * 10:24
> * etc
> 
> Now I don't think we have that many job failures every day. It's more like 1 
> or 2 per day. So I'm not sure what to think of it. 
> 
> I was trying to investigate why we see the following regularly (every day) in 
> our CI job logs:
> 
> Cannot contact Jenkins SSH Slave a6-009448n7sqon4: 
> java.lang.InterruptedException
> Agent Jenkins SSH Slave a6-009448n7sqon4 was deleted; cancelling node body
> Could not connect to Jenkins SSH Slave a6-009448n7sqon4 to send interrupt 
> signal to process
> 
> And then I discovered what I've pasted at 
> https://up1.xwikisas.com/#vI0VAypIpe_tD9LrQRTdMA by looking at the jenkins 
> master log file and I went "wow, how come there are so many disconnections".
> 
> Any idea is most welcome!
> 
> Thanks a lot
> -Vincent
> 
> 
> On Friday, 14 February 2020 19:50:27 UTC+1, Ivan Fernandez Calvo wrote:
>> 
>> Pingthread and some monitoring tasks run every 4 min. I think the 
>> disconnections happen before that process runs, but because there is no 
>> activity on these agents they are not detected until the pingthread runs. 
>> So I guess you have half-closed connections, I mean, the agent closes the 
>> connection but the master does not receive the reset packet. If you are 
>> using SSH agents, you can enable verbose mode on the sshd server to monitor 
>> what the heck happens, see 
>> https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/doc/TROUBLESHOOTING.md#common-info-needed-to-troubleshooting-a-bug
> 
> -- 
> You received this message because you are subscribed to a topic in the Google 
> Groups "Jenkins Users" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/jenkinsci-users/A1H9vVP-9c4/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to 
> jenkinsci-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/jenkinsci-users/6745b3f8-6da2-49b4-8e99-835fb67315dc%40googlegroups.com.

