Hello,

This problem is strangely transient. I've seen it happen to others when it wasn't happening to me with the same remote machine. Now I am having this problem again on two different servers that I manage.

I dug around a bit and found that calls to 'open-remote-pipe*' from guile-ssh have some chance of failure even though the SSH session is fine. This procedure is called many times during a deploy, so the odds are high that at least one of those calls will fail. I got lucky once today and had a deploy finish, but that was after many failures.

I was able to unblock myself by hacking call sites to repeatedly call 'open-remote-pipe*' in a loop, like this:

  (let loop ()
    (or (false-if-exception
         (apply open-remote-pipe* session OPEN_BOTH repl-command))
        (loop)))

I also added some 'pk' logging and found that 'open-remote-pipe*' would typically succeed on the first or second try.

I think there could be a bit more investigation done to better understand *why* this happens in the first place, but as a resiliency tactic I think it would be appropriate to write a wrapper procedure that retries a few times before giving up.

Thoughts?

- Dave
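
P.S. To make the wrapper idea concrete, here is a rough sketch of a bounded retry helper. The names 'call-with-retries', 'max-attempts' and 'delay-seconds' are made up for illustration and are not part of guile-ssh or Guix. Unlike the unbounded loop above, it gives up after a fixed number of tries, and the last try is made without 'false-if-exception' so the real error propagates rather than being swallowed.

```scheme
;; Hypothetical helper, not an existing guile-ssh or Guix procedure.
(define* (call-with-retries thunk #:key (max-attempts 5) (delay-seconds 1))
  "Call THUNK until it returns a true value, making at most MAX-ATTEMPTS
calls and sleeping DELAY-SECONDS (an integer) between them.  Exceptions
raised by all but the last call are swallowed; the last call runs
unprotected so that its exception, if any, reaches the caller."
  (let loop ((attempt 1))
    (if (>= attempt max-attempts)
        ;; Last try: no protection, let any exception propagate.
        (thunk)
        (or (false-if-exception (thunk))
            (begin
              (sleep delay-seconds)
              (loop (+ attempt 1)))))))

;; Intended use at a call site (left commented out because it needs
;; guile-ssh and a live session):
;;
;; (call-with-retries
;;  (lambda ()
;;    (apply open-remote-pipe* session OPEN_BOTH repl-command)))
```

The default of five attempts is a guess based on the 'pk' output above, where the call usually went through on the first or second try; the one second pause is likewise arbitrary.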