On Thu, May 21, 2020 at 8:27 PM Magnus Hagander <mag...@hagander.net> wrote:
>
> On Thu, May 21, 2020 at 1:08 PM Dave Page <dp...@pgadmin.org> wrote:
>
>> We see a non-trivial amount of automated build failures caused by git
>> timeouts and Varnish cache meditation. This is only likely to get worse as
>> we've automated so many different build configurations, and the PostgreSQL
>> sysadmin team don't really want to get into the business of large-scale
>> repository hosting.
>>
>> I suggest we move our primary repo to Github. Any objections?
>>
>>
> Don't you already have a mirror there? Can't you just point the tests at
> that, regardless of where you keep your primary? (It is distributed after
> all)
>

I had discounted that idea because I thought the sync was done by a cron
job, which would be a potential problem when doing releases if a push
doesn't hit the mirror immediately. However, having done a test commit and
remembered the garbage we get on the output, it does seem to be
synchronous. How does that work exactly? I can't see any obvious hooks.

I'll try just shifting the build systems to use GitHub and see where that
takes us.

> That said you are of course free to change the primary for whatever
> reason, but this one doesn't seem like one.
>
> And AFAIK nobody has actually reported any such issues. But it is
> certainly true that a lot of the git serving stuff is terribly slow -- but
> I was under the impression that it was mostly gitweb, since thats what
> people tend to report issues with...
>
> But again, no actual objections to moving.
>

We get a lot of failures that look like the following. They don't seem to
be restricted to any particular servers in our buildfarm (which have a mix
of 1 and 10Gb/s network connections on a 40Gb/s backbone), and the upstream
network connection is stable (per monitoring) and has 500Mb/s of bandwidth,
which should be more than enough. My working theory is that a dozen or more
clones hitting gothos at once is just too much.

Started by upstream project "pgadmin4-rpm-build <http://pgabf-jenkins.ox.uk.enterprisedb.com:8080/job/pgadmin4-rpm-build/>" build number 12 <http://pgabf-jenkins.ox.uk.enterprisedb.com:8080/job/pgadmin4-rpm-build/12>
originally caused by:
 Started by upstream project "pgadmin4-all-build <http://pgabf-jenkins.ox.uk.enterprisedb.com:8080/job/pgadmin4-all-build/>" build number 61 <http://pgabf-jenkins.ox.uk.enterprisedb.com:8080/job/pgadmin4-all-build/61>
 originally caused by:
  Started by user Dave Page <http://pgabf-jenkins.ox.uk.enterprisedb.com:8080/user/dpage>
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building remotely on pgabf-centos-7 <http://pgabf-jenkins.ox.uk.enterprisedb.com:8080/computer/pgabf-centos-7> (centos-7) in workspace /home/jenkins/workspace/pgadmin4-rpm-build/label/centos-7
[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Deferred wipeout is used...
[WS-CLEANUP] Done
No credentials specified
Cloning the remote Git repository
Cloning repository https://git.postgresql.org/git/pgadmin4.git
 > git init /home/jenkins/workspace/pgadmin4-rpm-build/label/centos-7 # timeout=10
Fetching upstream changes from https://git.postgresql.org/git/pgadmin4.git
 > git --version # timeout=10
 > git fetch --tags --progress https://git.postgresql.org/git/pgadmin4.git +refs/heads/*:refs/remotes/origin/* # timeout=10
ERROR: Error cloning remote repo 'origin'
hudson.plugins.git.GitException: Command "git fetch --tags --progress https://git.postgresql.org/git/pgadmin4.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout:
stderr: error: RPC failed; result=52, HTTP code = 0
fatal: The remote end hung up unexpectedly

    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2430)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:2044)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.access$500(CliGitAPIImpl.java:81)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.execute(CliGitAPIImpl.java:569)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:798)
    at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$GitCommandMasterToSlaveCallable.call(RemoteGitImpl.java:161)
    at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$GitCommandMasterToSlaveCallable.call(RemoteGitImpl.java:154)
    at hudson.remoting.UserRequest.perform(UserRequest.java:211)
    at hudson.remoting.UserRequest.perform(UserRequest.java:54)
    at hudson.remoting.Request$2.run(Request.java:369)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to pgabf-centos-7
        at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1741)
        at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
        at hudson.remoting.Channel.call(Channel.java:955)
        at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.execute(RemoteGitImpl.java:146)
        at sun.reflect.GeneratedMethodAccessor673.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.invoke(RemoteGitImpl.java:132)
        at com.sun.proxy.$Proxy71.execute(Unknown Source)
        at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1122)
        at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1167)
        at hudson.scm.SCM.checkout(SCM.java:505)
        at hudson.model.AbstractProject.checkout(AbstractProject.java:1205)
        at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
        at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
        at hudson.model.Run.execute(Run.java:1853)
        at hudson.matrix.MatrixRun.run(MatrixRun.java:153)
        at hudson.model.ResourceController.execute(ResourceController.java:97)
        at hudson.model.Executor.run(Executor.java:427)
ERROR: Error cloning remote repo 'origin'
[Boolean condition] checking [true] against [^(1|y|yes|t|true|on|run)$] (origin token: ${PUBLISH_FOR_QA})
Run condition [Boolean condition] enabling perform for step [[Send build artifacts over SSH]]
SSH: Current build result is [FAILURE], not going to run.
Finished: FAILURE

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
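
P.S. If the theory about simultaneous clones is right, one build-side mitigation (independent of where the primary repo lives) would be to retry the fetch with randomised delays so that a dozen jobs don't all hit the server again at the same moment. The following is an untested sketch of that idea, not something currently in the buildfarm. The function names, the delay values, and the idea of wrapping the fetch in a Python script are all assumptions for illustration; only the `git fetch` arguments are taken from the log above.

```python
import random
import subprocess
import time


def backoff_delays(retries, base=5.0, cap=120.0, rng=random.random):
    """Yield one sleep interval per retry: capped exponential backoff.

    Each interval is scaled by rng() (a value in [0, 1)), so parallel jobs
    that fail together are unlikely to retry together.
    """
    for attempt in range(retries):
        yield rng() * min(cap, base * (2 ** attempt))


def fetch_with_retry(url, workdir, attempts=4,
                     run=subprocess.run, sleep=time.sleep):
    """Run `git fetch` in workdir, retrying on a non-zero exit status.

    Returns the number of tries used. Raises RuntimeError if every try fails.
    `run` and `sleep` are injectable so the logic can be exercised offline.
    """
    # Same refspec and flags as the failing command in the Jenkins log.
    cmd = ["git", "fetch", "--tags", "--progress", url,
           "+refs/heads/*:refs/remotes/origin/*"]
    delays = backoff_delays(attempts - 1)
    for attempt in range(1, attempts + 1):
        result = run(cmd, cwd=workdir)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            sleep(next(delays))
    raise RuntimeError(f"git fetch failed after {attempts} attempts: {url}")
```

A job would call something like `fetch_with_retry("https://git.postgresql.org/git/pgadmin4.git", workspace)` after `git init`. This only papers over transient failures; it wouldn't tell us whether the server is actually overloaded, so the failure counts would still be worth watching.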