(Top-posting a question that rewinds this thread a bit. Feel free to continue other discussion on the latest inline email)

Why do so many tools require write access? It seems like there's at least *some* part of this that is a technical limitation... dare I say "error"?

My years-stale understanding (from reviewable.io and codecov.io IIRC, both of which I would have loved to use but couldn't, and not just on ASF repos) was that the limitation was GitHub's ACLs were too coarse-grained. Is this still true? Do they know this is a big problem? Are they leaving things as-is deliberately or through lack of funding?

OTOH my understanding of other tools (prow? Beam's defunct mergebot?) is that the tool itself really wants to manage the repo for you, queuing up merges and doing them, etc. I don't really know buildkite. It might be helpful to have a table on a wiki of where these tools fail the policies.

Technical opinion: in normal git workflow as I see it, any person or *tool* that wishes to create a branch can do so in its own fork. Wanting to write to a branch in some other person's or org's fork is like wanting to write to their hard drive: there are reasons, but doing so has to be inextricable from your core functionality, or you are probably doing it wrong.

Over the years, I've felt this pain of CI tools not being able to be used, but I have almost universally considered the *other* party to be the source of the pain, not ASF's very reasonable policies. Is ASF able to influence their roadmaps, or at least keep in touch about them? A combination of best practices amongst projects and tools that understand the whole point of git would go a long way.

(I welcome opinions that I am just wrong and these CI tools are doing exactly the best thing they should be doing - that would be new and useful info for me)

Kenn

On Mon, Feb 3, 2020 at 7:50 PM David Nalley <da...@gnsa.us> wrote:

> On Tue, Feb 4, 2020 at 4:29 AM Alex Harui <aha...@adobe.com.invalid> wrote:
> >
> > Hopefully last set of questions for now...
>
> Just wait, the rabbit hole gets deeper :)
>
> >
> > 1) It sounds like there is a risk that as the ASF grows, GH may not be able to grow with us. Did I understand that correctly?
>
> GH CI may not be willing to continue giving us free usage. The current free usage we have is limited, but they are willing to augment - to what degree we aren't sure yet. We're talking with Github.
> Github the VCS will always be free (at least for all versions of the future that I can foresee short of Github being shuttered)
>
> >
> > 2) If we have money to offer GH, why can't we offer money to the CI Vendors so we aren't really abusing their free tiers?
>
> We currently pay one CI vendor (Travis - the only one aside from GH that doesn't need write access). We pay them 12k a year, and are planning on increasing that spend in next year's budget.
> We've discussed paying or getting cloud credits from both Azure and AWS - but ran into the write access problem.
> We're currently discussing with GH getting credits or paying them for more Github Actions capacity.
>
> > 3) Does GH track my activity in the ASF GH repos as part of the API usage for Apache? IOW, am I adding to the ASF API count by closing an issue on github.com? Or if I ran a script on my computer that closed the issue by using their API?
>
> No, it's tied to our user/IP address. Your actions likely won't come close to our complex usage.
>
> >
> > I think builds.a.o is a great free service, but AIUI, the no-third-party-write-access rule is independent of whether CI is free or not. I cannot pay money and get write-access to the ASF repos. So I think I'm trying to see if there is a solution even if it did cost money.
> >
>
> I should have been more explicit - we aren't opposed to spending money on this, and do already spend some money.
> I'm worried that there is no limit to the money that could be spent - particularly when people don't have good insight into what their builds might cost the Foundation. So for instance, there was a project at the ASF that consumed 900 dollars/month of our 1000/month spend with Travis. They didn't realize that they were consuming so much. They also didn't realize that other projects were feeling the pain - they had optimized their CI builds to execute really fast in Travis - essentially concurrently consuming every builder. But the reality is that some projects need more resources than others and allocating resources appropriately becomes quite the challenge.
>
> > Thanks in advance,
> > -Alex
> >
> > On 2/3/20, 7:03 PM, "David Nalley" <da...@gnsa.us> wrote:
> >
> > On Tue, Feb 4, 2020 at 3:56 AM Alex Harui <aha...@adobe.com.invalid> wrote:
> > >
> > > Some questions inline. Apologies in advance for not really understanding this stuff. I'm primarily a client-side developer. My projects do not have automated PR testing at this point in time. I'm mainly exploring in case we become popular enough some day to need it.
> > >
> > > My line of thinking is that MS has, at least for now, generously provided free Azure VMs to ASF committers. If N committers from a project each get a VM, run CI on it, figure out some way to distribute PRs to those VMs, is there a viable workflow?
> > >
> > > On 2/3/20, 6:38 PM, "David Nalley" <da...@gnsa.us> wrote:
> > >
> > > Hi Alex,
> > >
> > > So this was explored. It creates some problems - first double the administration overhead - most of that is automated, but it means that our API usage doubles, and we're already hitting limits from Github.
> > >
> > > Is that a max-traffic limit or a limit on traffic before we have to start paying for usage?
> >
> > Max number of calls - and we've tried offering up money, they don't offer a product with more API calls.
> > Greg has even raised this issue all the way to the CEO of Github.
> >
> > >
> > > Second - at least one CI vendor thanked us for not doing that exactly - because the 'best' way to do it is to create an org per project or org per repo - and then the free tier is dedicated to that org. Except that's essentially abusing their free tier.
> > >
> > > Is "best" defined as lowest cost to the CI vendor or something else? What would the "second-best" scenario look like if there is one?
> >
> > Best - well it's the cheapest for us, and it gives the most control to the projects. So great from that perspective, but likely a bit unethical and abusive. It's essentially abusing all of the CI vendors' generosity by horizontally scaling our consumption of their freebies and using them per-repo or per project instead of per organization.
> >
> > >
> > > Finally - from a practical perspective, if everyone submits PRs and does testing against this apacheci org - that has become the de facto repo - it's where everyone is doing their work, and it makes provenance tracking.
> > >
> > > Didn't the ASF have read-only mirrors of repos? I think it led to some confusion, but I think folks still figured out.
> > >
> >
> > Not anymore.
> > We have an active-active copy of the repositories. People can actively commit against either our repos or the GH repos, and we magically move commits between the two. (There's an upcoming blog post on how all of this magic works)
> >
> > > As an aside - the mandate for no write access is not an infrastructure policy, it's a legal affairs requirement - we're merely implementing it.
> > >
> > > --David
> > >
> > > On Tue, Feb 4, 2020 at 3:24 AM Alex Harui <aha...@adobe.com.invalid> wrote:
> > > >
> > > > Moving board@ to BCC.
> > > > Attempting to move discussion to builds@
> > > >
> > > > I’m fine with the ASF maintaining its position on stricter provenance and therefore disallowing third-party write-access to repos.
> > > >
> > > > A suggestion was made, if I understood it correctly, to create a whole other set of repos that could be written to by third-parties. Would such a thing work? Then a committer would have to manually bring commits back from that other set to the canonical repo. That seems viable to me.
> > > >
> > > > A concern was raised that the project might cut its release from the “other set”, but IMO, that would be ok if the release artifacts could be verified, which should be possible by comparing the canonical repo against the “other repo”, at least for the source package, and if there are reproducible binaries, for the binary artifacts as well.
> > > >
> > > > Thoughts?
> > > > -Alex
> > > >
> > > > From: Greg Stein <gst...@gmail.com>
> > > > Reply-To: "bo...@apache.org" <bo...@apache.org>
> > > > Date: Monday, February 3, 2020 at 5:17 PM
> > > > To: "bo...@apache.org" <bo...@apache.org>
> > > > Subject: Re: [CI] What are the troubles projects face with CI and Infra
> > > >
> > > > On Mon, Feb 3, 2020 at 6:48 PM Alex Harui <aha...@adobe.com> wrote:
> > > > >...
> > > > How does Google or other non-ASF open source projects manage the provenance tracking?
> > > >
> > > > Note that most F/OSS projects don't worry about provenance to the level the Foundation worries. That affords them some flexibility that our choices do not allow. Those projects may also choose to trust tools with write access to their repositories, hoping they will not Do Something Bad(tm). We have chosen to not provide that trust.
> > > >
> > > > IMO, I do not think the Foundation should relax its stance on provenance, nor trust in third parties ... but that is one of the key considerations [for the Board] at the heart of being able to leverage some third party CI/CD services.
> > > >
> > > > Cheers,
> > > > -g