Thanks for the explanation. Is there documentation anywhere about Apache infrastructure's standards and requirements for external slaves?
Regards,
Dave

Sent from my iPhone

> On Feb 9, 2017, at 4:32 PM, Greg Stein <gst...@gmail.com> wrote:
>
> On Thu, Feb 9, 2017 at 5:53 PM, Allen Wittenauer
> <a...@effectivemachines.com> wrote:
>
>> ...
>>
>> The Mac OS X host was shut down literally a day after I sent out an
>> email to common-dev@hadoop announcing I had full build and patch testing
>> working. I had spent quite a bit of time getting Apache Yetus ported
>> over to work on Apache's OS X machine, then spent over a month working
>> out the Hadoop specifics, running build after build after build,
>> competing with the Apache Mesos jobs that also ran on that box. The
>> reason I was told it was killed was: "no one was using it". (Umm, what?
>> Clearly no one bothered looking at the build log.)
>
> This occurred before I started working as the Infrastructure
> Administrator (last fall). I don't know the full background, other than
> that a PMC requested that buildbot and then never used it. Yeah: maybe
> the build logs weren't examined to see that other projects had hopped
> onto it.
>
> I also believe we had to pay for that box, and it wasn't cheap.
>
> Today, our preferred model for non-Ubuntu boxes is to have other people
> own/run/manage those buildbots and hook them into our buildmaster. For
> example, people on the Apache Subversion project have several such 'bots.
>
> We are concentrating our in-house expertise on the Ubuntu platform, from
> both an operational and a cost angle. Four years ago, the Infra team had
> far fewer projects to support. Today, we have hundreds of projects and
> many thousands of committers to support, and we've had to reallocate in
> order to meet the incredible growth of the ASF.
>
> Unfortunately, especially for yourself and some others, the "smoothing
> down the edges" has been detrimental.
>
>> In parallel, I started working on the Solaris box... which was then
>> promptly shut down not long after I had filed a JIRA to see if we could
>> get the base CA certificates upgraded (which was pretty much all I
>> needed; after that I could have finished getting the Hadoop builds
>> working on it as well).
>
> We're still shutting down Solaris. Only one guy has experience with it,
> and he's also got a ton of other stuff to do.
>
> Our hardware that runs Solaris is also *very* old. Worse: we could never
> get a support contract for it; they wouldn't sell us one (messed up, but
> there it is). We really need to get that box fully shut down, unracked,
> and thrown out.
>
>> These were huge blows to Apache Hadoop, as one of the common complaints
>> amongst committers is the lack of resources to do cross-platform
>> testing. Given that the ASF had that infrastructure in place, being in
>> this position was kind of dumb of the project. Now the machines are
>> gone, and as a result the portability of the code is still very hit or
>> miss and the ASF is worse for it.
>
> Apache Hadoop is worse for it. As Gavin has noted, just in the past year
> we've increased our build farm dramatically. I believe the ASF is better
> for it. We also have a team better focused to support the growth of the
> ASF.
>
> We can all agree that turning off services sucks for some projects and
> people. But our growth has made demands on the Foundation and its Infra
> team that have forced our hand. We also have a funding model that simply
> doesn't support hiring a team large enough to retain the disparate array
> of services that we offered in the past.
>
>> Since that time, I've helped get the PowerPC build up and running, but
>> that's been about it... and even then, I spend little to no time on the
>> ASF side of the build bits for the projects I'm interested in, simply
>> because I have no idea whether I'll be wasting my time because "whoops,
>> we've changed direction again".
>
> Again, we'll happily link any buildbot into our buildmaster, so you can
> automate builds on your special bots. As you can see from the above, we
> won't be doing PowerPC: just Ubuntu for all machines and services from
> now on. This allows us (via Puppet) to easily reallocate, move, upgrade,
> and maintain our services. Years ago, each machine was manually
> configured, and when it went down, the Foundation suffered. Today, if a
> machine goes down, we can spin it back up in an hour or two thanks to
> that consistency.
>
> I do sympathize that our service reduction is painful, but I hope you can
> understand where the Foundation (and its Infra team) is coming from. We
> have vastly more projects to support today, which means more uniformity
> is required.
>
> Sincerely,
> Greg Stein,
> Infrastructure Administrator, ASF
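
For concreteness, the externally owned buildbot model described above looks roughly like the following on the master side. This is only a minimal sketch, not the ASF's actual buildmaster configuration: the worker name, shared secret, port, repository URL, and build commands are made up for illustration, and it uses the Buildbot 0.9+ "worker" terminology (0.8.x-era masters spelled the same things BuildSlave, c['slaves'], and slavenames).

    # Minimal master.cfg sketch of the "externally owned buildbot" model:
    # the project owns and runs the machine, and the buildmaster only needs
    # a worker entry plus a builder bound to that worker. All names and
    # credentials here are illustrative.
    from buildbot.plugins import schedulers, steps, util, worker

    c = BuildmasterConfig = {}
    c['title'] = "example buildmaster"
    c['buildbotURL'] = "http://buildmaster.example.org:8010/"
    c['db'] = {'db_url': "sqlite:///state.sqlite"}

    # Credentials the owner of the external machine uses to connect.
    c['workers'] = [worker.Worker("hadoop-osx-1", "a-shared-secret")]
    c['protocols'] = {'pb': {'port': 9989}}  # port the external worker dials in to

    # What runs on that worker: check out the source, then build and test.
    build = util.BuildFactory()
    build.addStep(steps.Git(repourl="https://github.com/apache/hadoop.git",
                            mode="incremental"))
    build.addStep(steps.ShellCommand(command=["mvn", "clean", "install"]))

    c['builders'] = [
        util.BuilderConfig(name="hadoop-osx",
                           workernames=["hadoop-osx-1"],
                           factory=build),
    ]
    c['schedulers'] = [
        schedulers.ForceScheduler(name="force", builderNames=["hadoop-osx"]),
    ]

    # On the project-owned machine itself, the owner would run something like:
    #   buildbot-worker create-worker . buildmaster.example.org:9989 \
    #       hadoop-osx-1 a-shared-secret
    #   buildbot-worker start .
    # (0.8.x-era installs used `buildslave create-slave` instead.)

With this split, the central master only holds the worker's name and password; everything about the machine itself (OS, toolchains, patching, uptime) stays with the project that owns it.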