Found this on dev@hadoop; moving it to common-dev (the ML we use).

I think there was an initiative to enable Windows pre-commit for every PR, and that seems to have gone wild. Either the number of PRs raised is far more than the nodes can handle, or something got misconfigured in the job itself so that builds are being triggered for all open PRs rather than only new ones, which is starving the nodes of resources.
To the best of my knowledge, @Gautham Banasandra <gaur...@apache.org> and @Iñigo Goiri <elgo...@gmail.com> are driving the initiative. Can you folks help check?

The Infra team has raised concerns here [1] on dev@hadoop. Most probably something went wrong while configuring the hadoop-multibranch-windows job: it shows some 613 PRs scheduled [2], so I think it scheduled builds for all open ones. Something similar happened long ago when we were doing migrations; pointers can be fetched from [3].

[1] https://lists.apache.org/thread/7nsyd0vtpb87fhm0fpv8frh6dzk3b3tl
[2] https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/view/change-requests/builds
[3] https://lists.apache.org/thread/8pxf2yon3r9g61zgv9cf120qnhrs8q23

-Ayush

On 2024/04/26 16:59:04 Wei-Chiu Chuang wrote:
> I'm not familiar with the Windows build. But you may have better luck reaching
> out to Apache Infra
> https://infra.apache.org/contact.html
>
> mailing list, jira or even slack
>
> On Fri, Apr 26, 2024 at 9:42 AM Cesar Hernandez <cesargu...@gmail.com>
> wrote:
>
> > Hello,
> > An option that can be implemented in the Hadoop pipeline [1] is to set a
> > timeout [2] on critical stages within the pipelines, for example in
> > the "Windows 10" stage.
> > As for the issue the CI build is logging [3] in the hadoop-multibranch jobs
> > reported by Chris, it seems the issue is around the Post (cleanup) pipeline
> > process.
> > My two cents is to use cleanWs() instead of deleteDir() as
> > documented in: https://plugins.jenkins.io/ws-cleanup/
> >
> > [1] https://github.com/apache/hadoop/blob/trunk/dev-support/jenkinsfile-windows-10
> > [2] https://www.jenkins.io/doc/pipeline/steps/workflow-basic-steps/#timeout-enforce-time-limit
> > [3]
> >
> > Still waiting to schedule task
> > Waiting for next available executor on ‘Windows <https://ci-hadoop.apache.org/label/Windows/>’
> > [Pipeline] // node
> > [Pipeline] stage <https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-1137/1/console#>
> > [Pipeline] { (Declarative: Post Actions)
> > [Pipeline] script
> > [Pipeline] {
> > [Pipeline] deleteDir
> > [Pipeline] }
> > [Pipeline] // script
> > Error when executing cleanup post condition:
> > Also: org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: ca1b7f2f-ec16-4bde-ac51-85f964794e37
> > org.jenkinsci.plugins.workflow.steps.MissingContextVariableException: Required context class hudson.FilePath is missing
> > Perhaps you forgot to surround the code with a step that provides this, such as: node
> >     at org.jenkinsci.plugins.workflow.steps.StepDescriptor.checkContextAvailability(StepDescriptor.java:265)
> >     at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:300)
> >     at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:196)
> >     at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:124)
> >     at jdk.internal.reflect.GeneratedMethodAccessor1084.invoke(Unknown Source)
> >     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> >     at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
> >     at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
> >     at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
> >     at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
> >     at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:41)
> >     at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
> >     at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
> >     at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:180)
> >     at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
> >     at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:163)
> >     at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:178)
> >     at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:182)
> >     at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:152)
> >     at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:152)
> >     at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
> >     at org.jenkinsci.plugins.workflow.cps.LoggingInvoker.methodCall(LoggingInvoker.java:105)
> >     at WorkflowScript.run(WorkflowScript:196)
> >     at ___cps.transform___(Native Method)
> >     at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:90)
> >     at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:116)
> >     at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixName(FunctionCallBlock.java:80)
> >     at jdk.internal.reflect.GeneratedMethodAccessor1046.invoke(Unknown Source)
> >     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> >     at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
> >     at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
> >     at com.cloudbees.groovy.cps.Next.step(Next.java:83)
> >     at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:152)
> >     at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:146)
> >     at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
> >     at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
> >     at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:146)
> >     at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
> >     at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
> >     at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:187)
> >     at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:423)
> >     at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:331)
> >     at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:295)
> >     at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:97)
> >     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >     at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
> >     at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
> >     at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
> >     at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
> >     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> >     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> >     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> >     at java.base/java.lang.Thread.run(Thread.java:829)
> > [Pipeline] }
> > [Pipeline] // stage
> > [Pipeline] End of Pipeline
> > Queue task was cancelled
> > org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: dc84ec50-8661-44a1-a7c0-ba575feca31d
> >
> >
> > On Fri, Apr 26, 2024 at 7:56, Chris Thistlethwaite (<chr...@apache.org>)
> > wrote:
> >
> > > Greetings all!
> > >
> > > It was brought to my attention this morning that all the shared Jenkins
> > > Windows nodes were leased out to ci-hadoop. Upon investigation, it
> > > looks like there are several builds stuck for the last 3+ days. The
> > > particular build in question is
> > > https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/
> > >
> > > There are a ton of Windows builds in the queue as well, so even if I
> > > start killing these off, they are going to be taking over the nodes
> > > again and likely failing/sticking at the same place.
> > >
> > > Can someone take a look at the build config? I'll have to force stop
> > > these builds.
> > >
> > > Please add me to any replies as I'm not subbed to this list.
> > >
> > > Thanks!
> > > -Chris T.
> > > #asfinfra
> > >
> >
> >
> > --
> > Sincerely,
> > César Hernández.
> >
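
For what it's worth, here is a rough sketch of how Cesar's two suggestions quoted above (a timeout on the "Windows 10" stage, and cleanWs() in place of deleteDir()) might look in a declarative Jenkinsfile. This is not the actual dev-support/jenkinsfile-windows-10: the 'Windows' label is taken from the console log above, the timeout value and the echo step are placeholders I made up, and cleanWs() assumes the Workspace Cleanup (ws-cleanup) plugin is installed.

```groovy
// Illustrative sketch only; not the real Hadoop Jenkinsfile.
pipeline {
    // Label as it appears in the console log quoted in this thread.
    agent { label 'Windows' }

    stages {
        stage('Windows 10') {
            options {
                // Hypothetical limit: abort the stage if it runs longer than this.
                timeout(time: 4, unit: 'HOURS')
            }
            steps {
                // Placeholder for the real build and test steps.
                echo 'build and test steps go here'
            }
        }
    }

    post {
        cleanup {
            // Workspace cleanup via the ws-cleanup plugin, as Cesar suggested,
            // replacing the script { deleteDir() } seen in the log.
            cleanWs()
        }
    }
}
```

One caveat from the Jenkins Pipeline syntax docs: with a top-level agent, the time spent waiting for an executor is not counted against the timeout; it is counted only when the agent is declared inside the stage that carries the timeout. Also, nothing in this thread confirms that cleanWs() avoids the MissingContextVariableException when a build never got a node, so that part would need testing.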