Well, I don't think the download time is significant [for all the builds at ANL] compared to the build times.

For example: most of the time petsc-pkg-hash gets reused [and this saves on both downloads and builds] - such builds take about 2h. But when packages have to be rebuilt, it can take 2:45 to 3h [so the download part must be pretty small].

But yeah - it's wasted bandwidth, and not tolerant to network disruptions. And the other issue: avoiding the downloads might help with CI in low-bandwidth locations [say, running a CI instance at my house on a spare laptop].

But yes - this requires infrastructure.

The way I look at it, we need a "local mirror" or "cache" infrastructure, i.e. keep the cache part separate from the build part [and not intertwine them].

Spack does stuff in this direction [and also has a remote cache, as any one of the 100 remote sites from where the packages are downloaded can be down - but it's not tolerant to certain changes, so I have to periodically clean it to have confidence in my build].

Note: if there is a git repo locally cached (and mirrored), we don't have to deal with shallow clones.

It might have a bigger impact if we can improve the petsc-pkg-hash infrastructure to avoid rebuilds in more cases [i.e. make it more tolerant to configure changes - but it's not clear to me which changes won't require rebuilds].

Satish

On Sun, 11 Oct 2020, Barry Smith wrote:

>
>   Satish,
>
>   Do you think the time to download all the external packages for each
>   job is significant?
>
>   Would using super shallow clones on the external packages help much
>   in time? Maybe we should do them anyway to stop wasting bandwidth?
>   Currently we do full clones? But we don't need the huge histories.
>
>   A much more elaborate way to save more time:
>
>   On each test machine, have repositories of all the external packages.
>
>   For each job,
>
>     do a pull from the remote in each of these repositories that the
>     job depends on (usually this will get nothing, so take no time)
>
>     For each package either
>
>       - build directly in a unique build directory under the repository
>         directory (for CMake and packages that support out of base
>         directory builds)
>
>       - make a local shallow clone of the local copy of the repository
>         to externalpackages for the rest and do those builds there
>
>   The average cost of this will just be some shallow local clones
>   instead of copying over from remote machines.
>
>   The PETSc test directories can still be completely cleaned out for
>   each job, so Satish need not worry about testing with dirty
>   directories.
>
>   This requires a bit of infrastructure. If it saves a minute it is not
>   worth it, but if it cuts the pipeline time from 180 minutes to 150,
>   maybe? Probably not worth it. It could also be done just for a couple
>   of the most external-package-intense jobs.
>
>   Barry
>
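
For illustration only, the per-machine repository scheme quoted above, with the cache kept separate from the build, could be sketched roughly as follows. This is not what PETSc's configure or the CI currently does. The names `mirror_dir`, `fetch_package`, `cache_root` and `dest` are hypothetical, and it assumes each package is pinned to a branch or tag name (a pinned commit hash would need a different fetch step, since `git clone --branch` does not take one). Only standard git commands are used: `git clone --mirror`, `git fetch --prune`, and `git clone --depth 1` through a `file://` URL, because git ignores `--depth` for plain local-path clones.

```python
import subprocess
from pathlib import Path


def mirror_dir(cache_root: Path, url: str) -> Path:
    """Map a remote URL to a bare mirror directory under cache_root.

    Simplification: only the last URL component is used, so two packages
    with the same repository name on different hosts would collide.
    """
    name = url.rstrip("/").rsplit("/", 1)[-1]
    if not name.endswith(".git"):
        name += ".git"
    return cache_root / name


def fetch_package(url, ref, cache_root, dest, run=subprocess.check_call):
    """Update (or create) the persistent mirror, then shallow-clone from it.

    `run` executes one git command given as a list of strings; it is a
    parameter so the sequence of commands can be inspected without git.
    """
    mirror = mirror_dir(Path(cache_root), url)
    if mirror.exists():
        # Usually fetches nothing, so it costs almost no time or bandwidth.
        run(["git", "-C", str(mirror), "fetch", "--prune", "origin"])
    else:
        # One-time full download; later jobs on this machine reuse it.
        run(["git", "clone", "--mirror", url, str(mirror)])
    # Local shallow clone into the (disposable) build tree. The file:// URL
    # is needed because --depth is ignored for plain local-path clones.
    run(["git", "clone", "--depth", "1", "--branch", ref,
         mirror.resolve().as_uri(), str(dest)])
    return mirror
```

With something like this, the job's build tree (`dest`) can still be wiped for every job, while `cache_root` persists across jobs on the test machine; if the remote is unreachable, only the fetch step fails and the existing mirror is still there.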
