Hey folks,

Some of my recent reviews have been frequent fliers in the land of CI gate
jobs, and I've spent a fair amount of time diagnosing intermittent ssh
failures to containers in AIO builds. The error I get most often is this:

    SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh

After digging in Ansible code for a bit, I found the error within the ssh 
connection plugin[1].  It looks like an issue where the ssh connection is 
actually open but data cannot be sent to the subprocess.
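
To illustrate that failure mode, here is a minimal sketch, not the plugin's
actual code. It assumes the error comes from a failed write to the child
process's stdin; a short-lived Python child stands in for ssh, and
send_initial_data is a hypothetical helper name.

```python
import subprocess
import sys


def send_initial_data(proc, data):
    """Write data to the child's stdin and report a failed write."""
    try:
        proc.stdin.write(data)
        proc.stdin.close()  # closing also flushes anything still buffered
    except OSError as exc:  # BrokenPipeError is a subclass of OSError
        return "data could not be sent: %s" % exc
    return "ok"


# Stand-in for ssh: a child that starts fine but stops reading its stdin.
child = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdin.close()"],
    stdin=subprocess.PIPE,
)
child.wait()  # the process was launched successfully, but nobody reads the pipe

# A payload larger than the pipe buffer, so the write cannot be silently queued.
result = send_initial_data(child, b"x" * 65536)
print(result)
```

The process launches without complaint, yet the write fails, which matches
the "open but cannot send" symptom described above.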

I messed around heavily with multiplexing, keys, GSSAPI, and more, but the
errors still appear at random. I've proposed a review[2] that switches gate
jobs only to the paramiko transport, and it has run four times without ssh
errors (although two builds timed out because the repo build took too long).
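
As a rough sketch of a gate-only transport switch (an assumption about the
approach, not the contents of the review): Ansible reads its default
transport from the ANSIBLE_TRANSPORT environment variable, so a wrapper could
set it only for gate runs. The gate_env helper and the is_gate_job flag are
hypothetical names used for illustration.

```python
import os


def gate_env(base_env, is_gate_job):
    """Return a copy of base_env, forcing paramiko only for gate jobs."""
    env = dict(base_env)
    if is_gate_job:
        # ANSIBLE_TRANSPORT overrides the default "smart" transport selection.
        env["ANSIBLE_TRANSPORT"] = "paramiko"
    return env


# Normal runs keep whatever transport is already configured.
env = gate_env(os.environ, is_gate_job=True)
print(env["ANSIBLE_TRANSPORT"])
```

The resulting dictionary can be passed as env= to subprocess.call() when
launching ansible-playbook, which leaves non-gate deployments on the regular
ssh transport.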

The fifth build is running now and it seems to be moving along fairly quickly.

[1] https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/connection/ssh.py#L245-L260
[2] https://review.openstack.org/#/c/248361/

--
Major Hayden

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev