tl;dr
Can, and should, all Jenkins agents be (automatically) authenticated with
Docker Hub, to improve the stability of docker commands?


This past week the ci-cassandra.apache.org CI fell over because a fair
percentage of docker pulls failed. Our pipeline runs a lot of docker
containers. In the past week the number of containers run went from ~180 to
~270, and that pushed something over the edge. All of a sudden every docker
command had a significant chance of failing, high enough that every
pipeline run was effectively guaranteed to fail. These failures came in a few
different forms; the list can be read in INFRA-21666. Googling them shows
that this is a known problem, sometimes attributed to firewalls, networks,
DNS, etc. Our Jenkins agents are donated by a handful of different companies
and are located in various places, so a cause local to one network or
firewall doesn't make much sense. The other typical fix reported was to run
docker authenticated, i.e. `docker login`. Trying this immediately solved
all problems on ci-cassandra.apache.org. This was done with a temporary (and
empty) Docker Hub account that each agent has been manually logged in to.
Based on all this, the request has been made for an official Apache CI
Docker Hub account and to have Jenkins agents logged in automatically, with
credentials stored in an appropriate manner.
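
For illustration only, an automated login step on an agent might look
roughly like the sketch below. It is not how Infra would necessarily do it:
the function names and the `ci-bot` account are hypothetical, and where the
token comes from (a Jenkins credential, a secrets file, etc.) is left open.
The one thing it tries to show is passing the token on stdin via
`--password-stdin`, so it never appears in the process arguments or shell
history.

```python
import subprocess


def docker_login_command(username, registry=None):
    """Build a non-interactive `docker login` command line.

    The token is deliberately not part of the arguments; it is
    supplied on stdin by docker_login() below.
    """
    cmd = ["docker", "login", "--username", username, "--password-stdin"]
    if registry:
        # With no registry argument, docker uses its default (Docker Hub).
        cmd.append(registry)
    return cmd


def docker_login(username, token, registry=None, run=subprocess.run):
    """Log an agent in, feeding the token to docker on stdin.

    `run` is injectable (hypothetical hook) so the step can be
    exercised without a docker daemon; by default it is subprocess.run.
    """
    return run(
        docker_login_command(username, registry),
        input=token.encode(),
        check=True,  # raise if docker reports a failed login
    )
```

An agent bootstrap script could then call something like
`docker_login("ci-bot", token)` once, with `token` read from wherever the
credentials end up being stored.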

Has anyone experienced such issues before?
Is this a sound and reasonable request to make of Infra?

regards,
Mick
