> On Dec 30, 2021, at 10:58 AM, Chris Lambertus <c...@apache.org> wrote:
>
> Hi folks,
>
> We have some funding to explore providing ephemeral builds via ECS or EKS in
> the Amazon ecosystem, but Infra does not have expertise in this area. We
> would like to integrate such a service with Jenkins.
>
> Does anyone have experience with using these services for CI, and would you
> be interested in assisting Infra in developing a prototype?
>
> Additionally, we may be able to provide some build nodes with GPUs. Do we
> have projects which could/would make use of GPUs for integration testing?
At $DAYJOB, I configured the Amazon EC2 plug-in (
https://plugins.jenkins.io/ec2 ) to do this type of thing using spot instances
with labels tied to the particular EC2 node type that our jobs use. I avoided
using the EC2 Fleet plug-in ( https://plugins.jenkins.io/ec2-fleet ) mainly
because it always seemed to keep at least one node running, which is not really
what you want if you want to get the most bang for your buck. In other words,
a slower startup matters less to me than avoiding a node that sits idle all
weekend.
Biggest issues we’ve hit with this setup are:
a) Depending upon your spot price, you may get outbid and the node gets killed
out from underneath you (rarely happens but it does happen with our bid)
b) You need to know ahead of time what types of nodes you want to allocate and
then set a label to match. For the ASF, that might be tricky given a lot of
people have no idea what the actual requirements for their jobs are.
c) On rare occasions during a Jenkins restart, the plug-in will ‘lose track’ of
allocated nodes. We limit how long our allocations last, based on # of runs
and idle time, so we can generally spot a ‘stuck’ node after a day or so.
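For what it's worth, the check we do by eye for (c) could be sketched roughly
like this in Python. This is not something the plug-in provides: the Node
fields, the limit values, and find_stuck are all made up for illustration, and
you would have to fill the node list from your own EC2 and Jenkins inventory.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical limits; real values depend on how your allocations are configured.
MAX_IDLE = timedelta(minutes=30)   # idle time after which a node should be retired
MAX_RUNS = 10                      # number of builds after which a node should be retired
GRACE = timedelta(hours=1)         # slack before we call a node 'stuck'


@dataclass
class Node:
    instance_id: str
    label: str               # label tied to the EC2 node type
    runs: int                # builds executed on this node so far
    last_busy: datetime      # last time the node was running a build
    known_to_jenkins: bool   # False if Jenkins no longer lists the node


def find_stuck(nodes, now):
    """Return ids of nodes that are still allocated but should be gone."""
    stuck = []
    for n in nodes:
        idle = now - n.last_busy
        lost = not n.known_to_jenkins                  # Jenkins lost track of it
        over_idle = idle > MAX_IDLE + GRACE            # idle well past the limit
        over_runs = n.runs >= MAX_RUNS and idle > GRACE  # hit run limit, still around
        if lost or over_idle or over_runs:
            stuck.append(n.instance_id)
    return stuck


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    example = [
        Node("i-aaa", "c5.xlarge", 2, now - timedelta(minutes=5), True),
        Node("i-bbb", "c5.xlarge", 3, now - timedelta(days=2), True),
    ]
    print(find_stuck(example, now))
```

The grace period is there so a node that is merely slow to shut down does not
get flagged; only ones that outlive their limits by a wide margin show up.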
I haven’t tried configuring it to use EKS because none of our stuff needs k8s yet.