Bryan Beaudreault created HBASE-28455:
-----------------------------------------
Summary: do-release-docker fails to set up gpg agent proxy if proxy
container is slow to start
Key: HBASE-28455
URL: https://issues.apache.org/jira/browse/HBASE-28455
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
In do-release-docker.sh we spin up the gpg-agent-proxy container, then
immediately run ssh-keyscan, and then immediately run ssh. Despite having
{{set -e}}, both of these can fail without failing the script. This
manifests as a hard-to-debug failure in the hbase-rm container:
"gpg: no gpg-agent running in this session".
With some debugging I realized that the ssh tunnel had not been created.
Looking at the logs, the gpg-agent-proxy.ssh-keyscan file is empty and the
gpg-proxy.ssh.log shows a Connection refused error.
You'd think these would fail the script, but they don't, each for a different reason:
# ssh-keyscan output is piped through sort. Running ssh-keyscan directly
returns an error code, but the exit status of a pipeline is that of its last
command, so piping it through sort turns it into a success code.
# ssh is executed in the background with {{&}}, which similarly loses the
error code.
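Both behaviours can be reproduced without ssh at all. The sketch below is only an illustration, not code from do-release-docker.sh: failing_cmd is a hypothetical stand-in for the failing ssh-keyscan or ssh call, and it assumes bash (for {{set -o pipefail}}).

```shell
#!/usr/bin/env bash
# Stand-in for a command that fails, e.g. ssh-keyscan against a port that is not yet open.
failing_cmd() { return 1; }

# 1. By default a pipeline reports only the status of its last command (sort).
failing_cmd | sort
pipeline_status=$?                      # 0: the failure is hidden

# With pipefail, a failing stage makes the whole pipeline fail.
pipefail_status=0
( set -o pipefail; failing_cmd | sort ) || pipefail_status=$?

# 2. Launching a job with & succeeds immediately; wait retrieves the real status.
failing_cmd &
launch_status=$?                        # 0: only says the job was started
bg_pid=$!
wait_status=0
wait "$bg_pid" || wait_status=$?

echo "pipeline=$pipeline_status pipefail=$pipefail_status launch=$launch_status wait=$wait_status"
```

Since the ssh tunnel is meant to stay up, a plain wait would block until it exits, so it is not a drop-in fix. One possibility, untested here, is to let ssh background itself with -f together with -o ExitOnForwardFailure=yes, so that a refused connection is reported before the script moves on.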
I think we should add a step prior to ssh-keyscan which waits until port 62222
is accepting connections. I'm not sure how to retain the error codes in the two
commands above, but can try to look into that as well.
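A wait step could look roughly like the following. This is a minimal sketch under assumptions: wait_for_port is a hypothetical helper name, the host and retry count are placeholders, and it relies on bash's /dev/tcp redirection rather than any tool the script is known to use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a TCP port once per second until it accepts
# a connection or the attempt budget runs out.
wait_for_port() {
  local host=$1 port=$2 max_attempts=${3:-30}
  local attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    # bash's /dev/tcp pseudo-path attempts a TCP connect; running it in a
    # subshell means the descriptor is closed again as soon as it exits.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep 1
  done
  echo "Gave up waiting for ${host}:${port} after ${max_attempts} attempts" >&2
  return 1
}

# Assumed usage before ssh-keyscan (the host name is a guess, not from the script):
#   wait_for_port localhost 62222 60 || exit 1
```

Because the helper returns non-zero on timeout, calling it under {{set -e}} (or with an explicit {{|| exit 1}}) would stop the script early with a clear message instead of the later gpg-agent error.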