Hi Robbie,

Yes, sorry about that. Those matrix jobs are tricky. I was so fixed on the idea that the Windows slaves were the bottleneck that I didn't consider that it might be the other way around. My bad.
How come you can't select a node or label to use for a matrix job?

On Tue, Apr 15, 2014 at 2:21 AM, Robbie Gemmell <robbie.gemm...@gmail.com> wrote:

> '1' was possibly not stuck. It is a matrix project, although while the
> matrix itself can launch on any node including the Windows ones (something
> we apparently cant control) it doesnt use a numbered executor on the slave
> while doing so which is how you killed 3 things when the node only has 2
> executors. The individual jobs within those matrix projects are restricted
> to only run on the Ubuntu nodes, with each sub part getting scheduled
> individually at the end of the job queue after the previous sub part
> completes. Most of the time for the matrix running is simply spent waiting
> for its parts to get to the front of the queue again.
>
> The project was defined that way to ensure we didnt effectively use a
> larger single block of time (2 to 2.5hrs depending on the particular Ubuntu
> nodes used and what else is running) the way many jobs do seem to, though
> it means it can take a very long time for the matrix as a whole to complete
> if the job queue is long due to the number of times it has to wait for each
> part to get to the front of the queue. This seemed fairer than either
> running the parts in a group of separate jobs or a single job and
> effectively only queing once, but it does mean people see the matrix
> sitting there doing not very much for quite some time.
>
> Though they weren't using any executors on the Windows nodes, I have
> regardless disabled the periodic build on the job which triggers '1'.
>
> Robbie
>
> On 14 April 2014 20:37, Dennis Lundberg <denn...@apache.org> wrote:
>
>> I have just killed the following jobs on windows1, they had been stuck
>> for 23+ hours:
>> 1. https://builds.apache.org/job/Qpid-Java-Java-BDB-TestMatrix/
>> 2. https://builds.apache.org/job/river-qa-refactor-win6/
>> 3. https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008_java/
>>
>> Together they were effectively blocking all other projects that needed
>> a windows slave.
>>
>> The problem with 1 is that it is triggered by
>> https://builds.apache.org/job/Qpid-Java-Java-MMS-TestMatrix
>> which in turn is on a periodical schedule (once a day, 0 9 * * *) as
>> well as an SCM poll schedule (once every 15 minutes, */15 * * * *)
>>
>> The same problem goes for 3 which is on a periodical schedule (once a
>> day, 30 8 * * *)
>>
>> In my opinion we should not allow periodical schedules.
>>
>> On Sun, Apr 13, 2014 at 10:46 AM, Gavin McDonald <ga...@16degrees.com.au>
>> wrote:
>> > Managed to kill 3 of them, looking into why.
>> >
>> > Gav…
>> >
>> > On 13/04/2014, at 7:01 AM, Erik de Bruin <e...@ixsoftware.nl> wrote:
>> >
>> >> Currently there are 4 builds stuck on the windows1 slave. They seem to have
>> >> stopped on the SCM step right at the beginning of their builds.
>> >>
>> >> Can you please take a look?
>> >>
>> >> EdB
>> >>
>> >> On Fri, Apr 11, 2014 at 4:43 PM, Alex Harui <aha...@adobe.com> wrote:
>> >>
>> >>> Hi Jake,
>> >>>
>> >>> Thanks for restarting. I can't help but wonder if there is still some
>> >>> configuration issue with Jenkins and Git that is causing Windows1 to run
>> >>> out of memory. Is there an investigation going on in that regard?
>> >>>
>> >>> Thanks,
>> >>> -Alex
>> >>>
>> >>> On 4/11/14 7:38 AM, "Jake Farrell" <jfarr...@apache.org> wrote:
>> >>>
>> >>>> Hey Erik
>> >>>> Windows1 ran out of memory, restarted and builds in the queue have been
>> >>>> picked up and are running
>> >>>>
>> >>>> -Jake
>> >>>>
>> >>>> On Fri, Apr 11, 2014 at 10:17 AM, Erik de Bruin <e...@ixsoftware.nl>
>> >>>> wrote:
>> >>>>
>> >>>>> Same week, second time... The 'windows1' slave is offline. There are
>> >>>>> builds that have been in the queue for over 12 hours, so it's not
>> >>>>> 'idling'.
>> >>>>>
>> >>>>> Can someone look at this, please?
>> >>>>>
>> >>>>> Thanks,
>> >>>>>
>> >>>>> EdB
>> >>>>>
>> >>>>> On Tue, Apr 8, 2014 at 1:08 AM, David Nalley <da...@gnsa.us> wrote:
>> >>>>>
>> >>>>>> Jan and I discussed this briefly at ApacheCon and are tossing around
>> >>>>>> the idea of having Circonus monitor the status of the slave (according
>> >>>>>> to Jenkins) and perhaps to take corrective action automagically. We're
>> >>>>>> going to continue to think and work on this. Neither of us have admin
>> >>>>>> privs on the Window's slaves, so we'd want folks that do (and are thus
>> >>>>>> responsible for maintaining them) to bless this approach.
>> >>>>>>
>> >>>>>> --David
>> >>>>>>
>> >>>>>> On Mon, Apr 7, 2014 at 11:17 AM, Alex Harui <aha...@adobe.com> wrote:
>> >>>>>>> Hi Jake,
>> >>>>>>>
>> >>>>>>> Is there some way you could create a "button" that we could hit to restart
>> >>>>>>> the Windows slave so we don't have to keep bothering you? Or does it
>> >>>>>>> require human intervention to get it to come back up?
>> >>>>>>>
>> >>>>>>> Maybe some script we can get at from people.a.o, or a custom Jenkins task
>> >>>>>>> that we kick, or a button on the wiki that runs some script code?
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>> -Alex
>> >>>>>>>
>> >>>>>>> On 4/7/14 8:13 AM, "Erik de Bruin" <e...@ixsoftware.nl> wrote:
>> >>>>>>>
>> >>>>>>>> Good news.
>> >>>>>>>>
>> >>>>>>>> Excellent service, thank you!
>> >>>>>>>>
>> >>>>>>>> EdB
>> >>>>>>>>
>> >>>>>>>> On Mon, Apr 7, 2014 at 4:22 PM, Jake Farrell <jfarr...@apache.org> wrote:
>> >>>>>>>>
>> >>>>>>>>> Hey Erik
>> >>>>>>>>> I just restarted windows 1 and it has picked up the Apache Flex build and
>> >>>>>>>>> is running it right now.
>> >>>>>>>>>
>> >>>>>>>>> -Jake
>> >>>>>>>>>
>> >>>>>>>>> On Mon, Apr 7, 2014 at 10:08 AM, Erik de Bruin <e...@ixsoftware.nl> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi,
>> >>>>>>>>>>
>> >>>>>>>>>> This is becoming a weekly event... both 'windows' slaves are offline,
>> >>>>>>>>>> again.
>> >>>>>>>>>>
>> >>>>>>>>>> You might want to seriously consider accepting the offers to help from
>> >>>>>>>>>> the friendly people in the "volunteering for ASF Jenkins farm service
>> >>>>>>>>>> maintenance" thread.
>> >>>>>>>>>>
>> >>>>>>>>>> EdB
>> >>>>>>>>>>
>> >>>>>>>>>> On Thu, Apr 3, 2014 at 7:22 PM, Jake Farrell <jfarr...@apache.org> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> restarted, builds should start getting picked up shortly
>> >>>>>>>>>>>
>> >>>>>>>>>>> -Jake
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Thu, Apr 3, 2014 at 1:05 PM, Erik de Bruin <e...@ixsoftware.nl>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Both Windows slaves seem to be offline. There are several 'windows' builds
>> >>>>>>>>>>>> in the queue, so it seems they are not simply idling. Can you please take a
>> >>>>>>>>>>>> look?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> EdB
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On Tue, Apr 1, 2014 at 9:20 AM, Jake Farrell <jfarr...@apache.org> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> Hey Justin
>> >>>>>>>>>>>>> The builds look like they are working, now sure why java is giving you that
>> >>>>>>>>>>>>> error for the latest java path since
>> >>>>>>>>>>>>> /f/hudson/tools/java/latest-1.6-64/jre/bin/java.exe -version gives me a
>> >>>>>>>>>>>>> print out of 1.6.0_27. if you wouldnt mind creating a ticket for this so
>> >>>>>>>>>>>>> someone can investigate it I would appreciate it, its 3am for me and I need
>> >>>>>>>>>>>>> to call it a night
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> -Jake
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Tue, Apr 1, 2014 at 3:09 AM, Justin Mclean <jus...@classsoftware.com> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Hi,
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Flex-sdk_1 and flex-sdk_release fixed and started, looking through the
>> >>>>>>>>>>>>>>> other flex builds now
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> https://builds.apache.org/view/E-G/view/Flex/job/flex-sdk_1/60/
>> >>>>>>>>>>>>>>> https://builds.apache.org/view/E-G/view/Flex/job/flex-sdk_release/539/
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> While it looks like they are compiling I noticed this:
>> >>>>>>>>>>>>>> java.io.IOException: Cannot run program
>> >>>>>>>>>>>>>> "f:\hudson\tools\java\latest-1.6-64\jre\bin\java.exe
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> So look like the version of java it expects to use is missing??
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Justin
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> --
>> >>>>>>>>>>>> Ix Multimedia Software
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Jan Luykenstraat 27
>> >>>>>>>>>>>> 3521 VB Utrecht
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> T. 06-51952295
>> >>>>>>>>>>>> I. www.ixsoftware.nl
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>> Ix Multimedia Software
>> >>>>>>>>>>
>> >>>>>>>>>> Jan Luykenstraat 27
>> >>>>>>>>>> 3521 VB Utrecht
>> >>>>>>>>>>
>> >>>>>>>>>> T. 06-51952295
>> >>>>>>>>>> I. www.ixsoftware.nl
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Ix Multimedia Software
>> >>>>>>>>
>> >>>>>>>> Jan Luykenstraat 27
>> >>>>>>>> 3521 VB Utrecht
>> >>>>>>>>
>> >>>>>>>> T. 06-51952295
>> >>>>>>>> I. www.ixsoftware.nl
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Ix Multimedia Software
>> >>>>>
>> >>>>> Jan Luykenstraat 27
>> >>>>> 3521 VB Utrecht
>> >>>>>
>> >>>>> T. 06-51952295
>> >>>>> I. www.ixsoftware.nl
>> >>>>>
>> >>>
>> >>
>> >> --
>> >> Ix Multimedia Software
>> >>
>> >> Jan Luykenstraat 27
>> >> 3521 VB Utrecht
>> >>
>> >> T. 06-51952295
>> >> I. www.ixsoftware.nl
>> >
>>
>> --
>> Dennis Lundberg
>>

--
Dennis Lundberg