Found this on dev@hadoop -> Moving to common-dev (the ML we use)

I think there was an initiative to enable Windows pre-commit builds for every
PR, and that seems to have gone wild. Either the number of PRs raised is far
more than the nodes can handle, or something is misconfigured in the job
itself so that the build is triggered for all open PRs, not just new ones,
which is leading to resource starvation.

To the best of my knowledge,
@Gautham Banasandra <gaur...@apache.org> / @Iñigo Goiri <elgo...@gmail.com> are
driving the initiative. Can you folks help check?

There are concerns raised by the Infra team here [1] on dev@hadoop

Most probably something got messed up while configuring the
hadoop-multibranch-windows job. It shows some 613 PRs scheduled [2]; I think
it scheduled builds for all open ones. Something similar happened long ago
when we were doing migrations; pointers are in [3].

[1] https://lists.apache.org/thread/7nsyd0vtpb87fhm0fpv8frh6dzk3b3tl
[2] https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/view/change-requests/builds
[3] https://lists.apache.org/thread/8pxf2yon3r9g61zgv9cf120qnhrs8q23

-Ayush


On 2024/04/26 16:59:04 Wei-Chiu Chuang wrote:
> I'm not familiar with Windows build. But you may have better luck reaching
> out to Apache Infra
> https://infra.apache.org/contact.html
>
> mailing list, jira or even slack
>
> On Fri, Apr 26, 2024 at 9:42 AM Cesar Hernandez <cesargu...@gmail.com>
> wrote:
>
> > Hello,
> > An option that can be implemented in the Hadoop pipeline [1] is to set a
> > timeout [2] on critical stages within the pipeline, for example on the
> > "Windows 10" stage.
> > As for the issue the CI build is logging [3] in the hadoop-multibranch jobs
> > reported by Chris, it seems to be in the post (cleanup) pipeline
> > process. My two cents is to use cleanWs() instead of deleteDir(), as
> > documented at https://plugins.jenkins.io/ws-cleanup/
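
A minimal sketch of the two suggestions above, written as a declarative Jenkinsfile. It is not the actual dev-support/jenkinsfile-windows-10: the stage body, the 5-hour limit, and the placement of the cleanup inside the stage are assumptions made for illustration; only the "Windows 10" stage name and the 'Windows' label come from this thread. As far as I know, cleanWs() also needs a workspace, so swapping it for deleteDir() alone may not avoid the MissingContextVariableException when no node was ever allocated. Running the cleanup in the stage that owns the agent is one possible way around that.

```groovy
// Illustrative sketch only; values and structure are assumptions,
// not the real Hadoop Windows pipeline.
pipeline {
    // No global agent: each stage requests its own node.
    agent none

    stages {
        stage('Windows 10') {
            // Label taken from the console log quoted in this thread.
            agent { label 'Windows' }

            options {
                // Hypothetical limit. Per the Jenkins Pipeline syntax docs,
                // stage options are applied before the stage's agent is
                // entered, so this should also bound time spent waiting
                // for an executor.
                timeout(time: 5, unit: 'HOURS')
            }

            steps {
                // Placeholder for the real build steps.
                bat 'echo build goes here'
            }

            post {
                // Stage-level post runs on the stage's agent, so a
                // workspace (hudson.FilePath) is available here.
                cleanup {
                    // Provided by the Workspace Cleanup (ws-cleanup) plugin.
                    cleanWs()
                }
            }
        }
    }
}
```

With a limit like this, a build that hangs or never gets an executor should eventually abort instead of holding a shared node for days, though the right value would depend on how long a healthy Windows build takes.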
> >
> > [1] https://github.com/apache/hadoop/blob/trunk/dev-support/jenkinsfile-windows-10
> >
> > [2] https://www.jenkins.io/doc/pipeline/steps/workflow-basic-steps/#timeout-enforce-time-limit
> >
> > [3]
> >
> > Still waiting to schedule task
> > Waiting for next available executor on ‘Windows’
> > [Pipeline] // node
> > [Pipeline] stage
> > [Pipeline] { (Declarative: Post Actions)
> > [Pipeline] script
> > [Pipeline] {
> > [Pipeline] deleteDir
> > [Pipeline] }
> > [Pipeline] // script
> > Error when executing cleanup post condition:
> > Also:   org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: ca1b7f2f-ec16-4bde-ac51-85f964794e37
> > org.jenkinsci.plugins.workflow.steps.MissingContextVariableException: Required context class hudson.FilePath is missing
> > Perhaps you forgot to surround the code with a step that provides this, such as: node
> >         at org.jenkinsci.plugins.workflow.steps.StepDescriptor.checkContextAvailability(StepDescriptor.java:265)
> >         at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:300)
> >         at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:196)
> >         at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:124)
> >         at jdk.internal.reflect.GeneratedMethodAccessor1084.invoke(Unknown Source)
> >         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> >         at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
> >         at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
> >         at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
> >         at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
> >         at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:41)
> >         at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
> >         at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
> >         at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:180)
> >         at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
> >         at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:163)
> >         at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:178)
> >         at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:182)
> >         at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:152)
> >         at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:152)
> >         at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
> >         at org.jenkinsci.plugins.workflow.cps.LoggingInvoker.methodCall(LoggingInvoker.java:105)
> >         at WorkflowScript.run(WorkflowScript:196)
> >         at ___cps.transform___(Native Method)
> >         at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:90)
> >         at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:116)
> >         at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixName(FunctionCallBlock.java:80)
> >         at jdk.internal.reflect.GeneratedMethodAccessor1046.invoke(Unknown Source)
> >         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> >         at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
> >         at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
> >         at com.cloudbees.groovy.cps.Next.step(Next.java:83)
> >         at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:152)
> >         at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:146)
> >         at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
> >         at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
> >         at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:146)
> >         at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
> >         at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
> >         at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:187)
> >         at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:423)
> >         at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:331)
> >         at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:295)
> >         at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:97)
> >         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >         at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
> >         at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
> >         at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
> >         at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
> >         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> >         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> >         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> >         at java.base/java.lang.Thread.run(Thread.java:829)
> > [Pipeline] }
> > [Pipeline] // stage
> > [Pipeline] End of Pipeline
> > Queue task was cancelled
> > org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: dc84ec50-8661-44a1-a7c0-ba575feca31d
> >
> >
> > On Fri, Apr 26, 2024 at 7:56, Chris Thistlethwaite (<chr...@apache.org>)
> > wrote:
> >
> > > Greetings all!
> > >
> > > It was brought to my attention this morning that all the shared Jenkins
> > > Windows nodes were leased out to ci-hadoop. Upon investigation, it
> > > looks like there are several builds stuck for the last 3+ days. The
> > > particular build in question is
> > > https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/
> > >
> > > There are a ton of Windows builds in the queue as well, so even if I
> > > start killing these off, they are going to be taking over the nodes
> > > again and likely failing/sticking at the same place.
> > >
> > > Can someone take a look at the build config? I'll have to force stop
> > > these builds.
> > >
> > > Please add me to any replies as I'm not subbed to this list.
> > >
> > > Thanks!
> > > -Chris T.
> > > #asfinfra
> > >
> >
> >
> > --
> > Sincerely,
> > César Hernández.
> >
>
