How does a targeted hardware donation work? I was under the impression that targeted donations are not accepted by the ASF. Maybe it is different for infrastructure, but this is the first time I've heard of it. Who makes the donations for those projects? DataStax for Cassandra? Who for CouchDB? Google for Beam? By what process are the donations made, and how are they audited to confirm the donation is spent on the desired resources? Can we get a contact at one of them for a testimonial regarding this process? Is this process documented?
On Tue, Jul 24, 2018 at 4:27 PM Gav <ipv6g...@gmail.com> wrote:

> Hi Andrew,
>
> On Wed, Jul 25, 2018 at 3:21 AM Andrew Purtell <apurt...@apache.org>
> wrote:
>
>> Thanks for this note.
>>
>> I'm release managing the 1.4 release. I have been running the unit test
>> suite on reasonably endowed EC2 instances and there are no observed always
>> failing tests. A few can be flaky. In comparison the Apache test resources
>> have been heavily resource constrained for years and frequently suffer from
>> environmental effects like botched settings, disk space issues, and
>> contention with other test executors.
>>
>
> Our Jenkins nodes are configured via puppet these days and are pretty
> stable, which settings do you know of that might (still) be botched?
> Yes, resources are shared and on occasion run to capacity. This is one
> reason for my initial mail - these HBase builds are consuming 10 or more
> executors -at the same time- and are starving executors for other builds.
> The fact these tests have been failing for well over a month and that you
> mention below you will be ignoring them does not make for good cross ASF
> community spirit, we are all in this together and every little bit helps.
> This is not targeted at one project, others will be getting a similar note
> and I hope we can come to a resolution suitable for all.
> Disk space issues, yes, but not on most of the Hadoop and related projects
> nodes - H0-H12 do not have disk space issues. As a Hadoop related project
> HBase should really be concentrating its builds there.
>
>
>> I think a 1.4 release will happen regardless of the job test results on
>> Apache infrastructure. I tend to ignore them as noisy and low signal.
>> Others in the HBase community don't necessarily feel the same, so please
>> don't take my viewpoint as particularly representative. We could try Allen's
>> suggestion first, before ignoring them outright.
>>
>
> No problem
>
>
>> Has anyone given thought toward expanding the pool of test build
>> resources? Or roping in cloud instances on demand? Jenkins has support for
>> that.
>>
>
> We have currently 19 Hadoop specific nodes available H0-H19 and another 28
> or so general use 'ubuntu' nodes for all to use. In addition we have
> projects that have targeted donated resources and the likes of Cassandra,
> CouchDB and Beam all have multiple nodes on which they have priority. I'll
> throw an idea out there that perhaps HBase could do something similar to
> increase our node pool and at the same time have priority on a few nodes of
> their own via a targeted hardware donation.
> Cloud on demand has been tried a year or two ago, we will revisit this
> also soon.
>
> Summary then, we currently have over 80 nodes connected to our Jenkins
> master - what figure did you have in mind when you say 'expanding the pool
> of test build resources'?
>
> Thanks
>
> Gav...
>
>
>>
>> On Tue, Jul 24, 2018 at 9:16 AM Allen Wittenauer
>> <a...@effectivemachines.com.invalid> wrote:
>>
>>> I suspect the bigger issue is that the hbase tests are running
>>> on the ‘ubuntu’ machines. Since they only have ~300GB for workspaces, the
>>> hbase tests are eating a significant majority of it and likely could be
>>> dying randomly due to space issues. [All the hbase workspace directories +
>>> the yetus-m2 shared mvn cache dirs easily consume 20%+ of the space.
>>> Significantly more than the 50 or so other jobs that run on those
>>> machines.]
>>>
>>> By comparison, most of the ‘Hadoop’ nodes have 2-3TB for the big
>>> jobs to consume….
>>>
>>>
>>> > On Jul 24, 2018, at 8:58 AM, Josh Elser <els...@apache.org> wrote:
>>> >
>>> > Yep, sadly this is a very long tent-pole for us. There are many
>>> > involved who have invested countless hours in making this better.
>>> >
>>> > Specific to that job you linked earlier, 3 test failures out of our
>>> > total 4958 tests (0.06% failure rate) is all but "green" in my mind. I
>>> > would ask that you keep that in mind, too.
>>> >
>>> > To that extent, others have also built another job specifically to
>>> > find tests which are failing intermittently:
>>> > https://builds.apache.org/job/HBase-Find-Flaky-Tests/25513/artifact/dashboard.html.
>>> > I mention this as evidence to prove to you that this is not a baseless
>>> > request from the HBase PMC ;)
>>> >
>>> > On 7/24/18 3:14 AM, Gav wrote:
>>> >> Ok, good enough, will wait, please also note 'master' branch and a few
>>> >> others have been failing for over a month also.
>>> >> I will check in again next month to see how things are progressing
>>> >> Thanks
>>> >> Gav...
>>> >> On Tue, Jul 24, 2018 at 1:19 AM Josh Elser <els...@apache.org> wrote:
>>> >>> Hi Gav,
>>> >>>
>>> >>> Looking at the most recent results, I see that the job failed
>>> >>> because of two unit test failures. These are something that will be
>>> >>> looked at prior to the next 1.4.x release which is about to get off
>>> >>> the ground.
>>> >>>
>>> >>> I'd kindly request that you not disable the job. Thanks for trying to
>>> >>> find extra resources on these nodes.
>>> >>>
>>> >>> On 7/23/18 12:22 AM, Gavin McDonald wrote:
>>> >>>> https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/
>>> >>>>
>>> >>>> can someone take a look into this, the job isn't much good if it is
>>> >>>> failing all the time and even worse if it is being ignored.
>>> >>>>
>>> >>>> Otherwise I'll disable the job in a few days to release these wasted
>>> >>>> resources to builds that matter.
>>> >>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>
>>
>> --
>> Best regards,
>> Andrew
>>
>> Words like orphans lost among the crosstalk, meaning torn from truth's
>> decrepit hands
>>    - A23, Crosstalk
>>
>
>
> --
> Gav...
>

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk