> Is there any chance we could work in/learn about build caching in this process? Full builds for Heron take several hours, it'd be nice to speed them up.

Fortunately, it is possible with S3 [1]. We upload the caches to S3 via
Travis CI [2]. We can choose the nearest region, and using it from Travis CI
seems straightforward [3].

[1] https://docs.aws.amazon.com/general/latest/gr/s3.html
[2] https://docs.travis-ci.com/user/caching/#how-does-caching-work
[3] https://docs.travis-ci.com/user/deployment/codedeploy/#s3-deployment-or-github-deployment

Regards,
Janardhan

On Sun, Jan 30, 2022 at 2:29 AM Josh Fischer <j...@joshfischer.io> wrote:
>
> +1 from the Heron team. Is there any chance we could work in/learn about
> build caching in this process? Full builds for Heron take several hours,
> it'd be nice to speed them up.
>
> We use Bazel to build, here are some details:
> https://docs.bazel.build/versions/main/remote-caching.html
>
> On Sat, Jan 29, 2022 at 1:27 PM Chris Lambertus <c...@apache.org> wrote:
>
> > There is no timeline and certainly no design doc. We have funding, but
> > little in-house Infra experience with such an endeavor. We are looking for
> > a community champion with experience in this area to help us design a
> > solution.
> >
> > Our funding is in AWS, so yes, we could provide IAM access to specific
> > services once we get a general idea of the type of solution we want to
> > provide.
> >
> > Short term initiative:
> >
> > - develop a process for deploying 'on demand' build resources within
> >   Jenkins via EC2
> > - allow for the use of GPU nodes
> > - figure out how to track usage and constrain spending within the funding
> >   limit
> > - figure out how to deal with security push credentials (nexus, nightlies,
> >   dockerhub, etc.)
> >
> > Longer-term
> >
> > - provide EKS/ECS integration where appropriate
> >
> > The simplest case here would be for builds which are already containerized
> > (e.g. don't require Infra-deployed dependencies), as we could deploy a
> > "bare metal" AMI. If we needed to add the large number of tools Infra
> > maintains, creating and updating the AMI would be quite cumbersome.
> > This is something that will need to be sorted out if we are to roll out
> > general-purpose build nodes 'on-demand'.
> >
> > Here are some points of note from the thread so far:
> >
> > - Amazon EC2 Plugin for Jenkins can help
> > - GPU nodes desired by some projects
> > - Use of auto-scaling groups rather than containers
> >
> > Projects interested in contributing to setup/design:
> >
> > - SystemDS
> > - Airflow
> > - Heron
> >
> >
> > > On Jan 22, 2022, at 4:29 AM, Janardhan Pulivarthi <janardhan.pulivar...@gmail.com> wrote:
> > >
> > > Hi Chris,
> > >
> > > At present we would want to use AWS for GPU instances for testing and
> > > for building docker (gpu) images.
> > >
> > > Is there any timeline or design doc?
> > >
> > > How does the quota work for projects?
> > > Would you like to provide IAM accounts with the specific services
> > > needed for a project?
> > >
> > > Thanks and Regards,
> > > Janardhan
> > >
> > > On Sat, Jan 1, 2022 at 12:19 AM Allen Wittenauer
> > > <a...@effectivemachines.com.invalid> wrote:
> > >>
> > >>> On Dec 30, 2021, at 10:58 AM, Chris Lambertus <c...@apache.org> wrote:
> > >>>
> > >>> Hi folks,
> > >>>
> > >>> We have some funding to explore providing ephemeral builds via ECS or
> > >>> EKS in the Amazon ecosystem, but Infra does not have expertise in this
> > >>> area. We would like to integrate such a service with Jenkins.
> > >>>
> > >>> Does anyone have experience with using these services for CI, and
> > >>> would you be interested in assisting Infra in developing a prototype?
> > >>>
> > >>> Additionally, we may be able to provide some build nodes with GPUs. Do
> > >>> we have projects which could/would make use of GPUs for integration
> > >>> testing?
> > >>
> > >>
> > >> At $DAYJOB, I configured the Amazon EC2 plug-in (
> > >> https://plugins.jenkins.io/ec2 ) to do this type of thing using spot
> > >> instances with labels tied to the particular EC2 node type that our jobs
> > >> use.
> > >> I avoided using the EC2 Fleet plug-in (
> > >> https://plugins.jenkins.io/ec2-fleet ) mainly because it always seemed
> > >> to keep at least one node running, which is not really what you want to
> > >> get the most bang for your buck. In other words, startup time is less
> > >> important to me than avoiding a node running idle all weekend.
> > >>
> > >> Biggest issues we’ve hit with this setup are:
> > >>
> > >> a) Depending upon your spot price, you may get outbid and the node gets
> > >> killed out from underneath you (rarely happens but it does happen with
> > >> our bid)
> > >>
> > >> b) You need to know ahead of time what types of nodes you want to
> > >> allocate and then set a label to match. For the ASF, that might be
> > >> tricky given a lot of people have no idea what the actual requirements
> > >> for their jobs are.
> > >>
> > >> c) During a Jenkins restart on rare occasions, the plug-in will ‘lose
> > >> track’ of allocated nodes. We have limits for how long our allocations
> > >> will last based on # of runs and idle time so generally can spot a
> > >> ‘stuck’ node after a day or so.
> > >>
> > >> I haven’t tried configuring it to use EKS because none of our stuff
> > >> needs k8s yet.
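
For anyone picking up point (c) in the quoted message, here is a rough sketch of how a 'stuck' node might be flagged from run count and idle time. This is only an illustration, not how the EC2 plug-in works internally: the BuildNode record, the find_stuck_nodes helper, and the default limits (10 runs, 30 minutes idle) are all made up for the example, and a real check would have to pull these values from Jenkins and EC2.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BuildNode:
    """Hypothetical record of one allocated EC2 build node."""
    name: str
    runs: int                # builds completed on this node so far
    last_busy_at: datetime   # when the node last finished a build


def find_stuck_nodes(nodes, now, max_runs=10, max_idle=timedelta(minutes=30)):
    """Return the names of nodes still allocated past either limit.

    The limits are assumed values for illustration only.
    """
    stuck = []
    for node in nodes:
        over_runs = node.runs >= max_runs
        over_idle = (now - node.last_busy_at) >= max_idle
        if over_runs or over_idle:
            stuck.append(node.name)
    return stuck
```

Run periodically against the list of instances that are still alive, anything this returns would be a candidate for manual cleanup.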