tl;dr: Can, and should, all Jenkins agents be (automatically) authenticated with Docker Hub, for improved stability around docker commands?

This past week the ci-cassandra.apache.org CI fell over because a fair percentage of docker pulls failed. Our pipeline runs a lot of docker containers. In the past week the number of containers run went from ~180 to ~270, and that pushed something over the edge. All of a sudden every docker command had a significant chance of failing, high enough that every pipeline run was effectively guaranteed to fail.

These failures came in a few different forms; the list can be read in INFRA-21666. Googling them shows that this is a known problem, sometimes attributed to firewalls, networks, DNS, etc. Our Jenkins agents are donated by a handful of different companies and are located in various different places, so such environment-specific causes don't make much sense as an explanation.

The other typical fix reported was to just run docker authenticated, i.e. `docker login`. Trying this immediately solved all problems on ci-cassandra.apache.org. This was done with a temporary (and empty) Docker Hub account that each agent was manually logged in with.

Based on all this, the request has been made for an official Apache CI Docker Hub account, and to have Jenkins agents automatically logged in, with credentials stored in an appropriate manner.

Has anyone experienced such issues before? Is this a sound and reasonable request to ask of Infra?

regards,
Mick
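
P.S. As a rough sketch only of what an automatic, non-interactive `docker login` on an agent could look like: the function name `agent_docker_login`, the account name, and the token file path below are all hypothetical, and how the token gets onto the agent (Jenkins credentials, a config-management secret, etc.) is left to whatever Infra considers appropriate. The only docker features relied on are the `--username` and `--password-stdin` options of `docker login`.

```shell
#!/bin/sh
# Sketch, not Infra's actual setup. All names here are assumptions.

# Log an agent in to Docker Hub without prompting.
#   $1  account name (hypothetical, e.g. a dedicated CI account)
#   $2  path to a file holding the access token (hypothetical location)
# The token is fed on stdin so it does not show up in the process list
# or in the build log.
agent_docker_login() {
    user="$1"
    token_file="$2"

    if [ ! -r "$token_file" ]; then
        echo "token file not readable: $token_file" >&2
        return 1
    fi

    docker login --username "$user" --password-stdin < "$token_file"
}
```

An agent provisioning step, or the first step of a pipeline, could then call something like `agent_docker_login some-ci-account /path/to/token` before any `docker pull`.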