Re: Turning Off Bb-fbsd2

2017-02-09 Thread Dave Fisher
Thanks for the explanation. Is there documentation anywhere about Apache 
infrastructure's standards and requirements for external slaves?

Regards,
Dave

Sent from my iPhone

> On Feb 9, 2017, at 4:32 PM, Greg Stein  wrote:
> 
> On Thu, Feb 9, 2017 at 5:53 PM, Allen Wittenauer 
> wrote:
>> ...
> 
>>The Mac OS X host was shut down literally a day after I sent out
>> an email to common-dev@hadoop announcing I had full build and patch
>> testing working.  I had spent quite a bit of time getting Apache Yetus
>> ported over to work on Apache's OS X machine, then spent over a month on
>> working out the Hadoop specifics, running build after build after build.
>> Competing with the Apache Mesos jobs that also ran on that box. The reason
>> I was told it was killed was: "no one was using it".  (Umm, what?  Clearly
>> no one bothered looking at the build log.)
>> 
> 
> This occurred before I started working as the Infrastructure Administrator
> (last Fall). I don't know the full background, other than a PMC requested
> that buildbot, then never used it. Yeah: maybe the build logs weren't
> examined to see that other projects had hopped onto it.
> 
> I also believe we had to pay for that box, and it wasn't cheap.
> 
> Today, our preferred model for non-Ubuntu boxes is to have other people
> own/run/manage those buildbots and hook them into our buildmaster. For
> example, people on the Apache Subversion project have several such 'bots.
> 
> We are concentrating our in-house experience on the Ubuntu platform, from
> both an operational and a cost angle. Four years ago, the Infra team had
> many fewer projects to support. Today, we have hundreds of projects and
> many thousands of committers to support. We've had to reallocate in order
> to meet the incredible growth of the ASF.
> 
> Unfortunately, especially for yourself and some others, the "smoothing down
> the edges" has been detrimental.
> 
>> In parallel, I started working on the Solaris box, which was
>> then promptly shut down not too long after I had filed a jira to see if we
>> could get the base CA certificates upgraded. (which was pretty much all I
>> needed, after that I could have finished getting the Hadoop builds working
>> on it as well).
>> 
> 
> We're still shutting down Solaris. Only one guy has experience with it, and
> he's also got a ton of other stuff to do.
> 
> Our hardware that runs Solaris is also *very* old. Worse: we could never
> get a support contract for it. They wouldn't sell us one (messed up, but
> there it is). We really need to get that box fully shut down, unracked, and
> thrown out.
> 
>> These were huge blows to Apache Hadoop, as one of the common
>> complaints amongst committers is the lack of resources to do cross platform
>> testing. Given the ASF had that infrastructure in place, being in this
>> position was kind of dumb of the project.  Now the machines are gone and as
>> a result, the portability of the code is still very hit or miss and the ASF
>> is worse for it.
>> 
> 
> Apache Hadoop is worse for it. As Gavin has noted, just in the past year,
> we've increased our build farm dramatically. I believe the ASF is better
> for it. We also have a team better focused to support the growth of the ASF.
> 
> We can all agree that turning off services sucks for some projects and
> people. But our growth has made demands upon the Foundation and its Infra
> team that have forced our hand. We also have a funding model that just
> doesn't support us hiring a team large enough to retain the disparate array
> of services that we offered in the past.
> 
> 
>> Since that time, I've helped get the PowerPC build up and running,
>> but that's been about it... and even then, I spend little-to-no time on the
>> ASF-side of the build bits for those projects I'm interested in, simply because
>> I have no idea if I'll be wasting my time because "whoops, we've changed
>> direction again".
> 
> 
> Again, we'll happily link any buildbot into our buildmaster, so you can
> automate builds on your special bots. As you can see from above, we won't
> be doing PowerPC. Just Ubuntu for all machines and services from now on.
> This allows us (via Puppet) to easily reallocate, move, upgrade, and
> maintain our services. Years ago, each machine was manually configured, and
> when it went down, the Foundation suffered. Today, if a machine goes down,
> we can spin it back up in an hour or two due to the consistency.
> 
> I do sympathize that our service reduction is painful. But I hope you can
> understand where the Foundation (and its Infra team) is coming from. We
> have vastly more projects to support today, meaning more uniformity is
> required.
> 
> Sincerely,
> Greg Stein,
> Infrastructure Administrator, ASF



Re: Turning Off Bb-fbsd2

2017-02-14 Thread Dave Fisher
Hi Gavin,

Perfect!

Thank you very much for your dedicated support of the ASF!

Regards,
Dave

Sent from my iPhone

> On Feb 14, 2017, at 4:20 PM, Gavin McDonald  wrote:
> 
> 
>> On 10 Feb 2017, at 11:46 am, Dave Fisher  wrote:
>> 
>> Thanks for the explanation. Is there documentation anywhere about Apache 
>> infrastructure's standards and requirements for external slaves?
> 
> There wasn’t one, but now there is ;)
> 
> https://reference.apache.org/committer/node-hosting 
> 
> Please let me know if you have any further questions.
> 
> Gav…

Re: Can we package release artifacts on builds.a.o?

2019-01-06 Thread Dave Fisher



Sent from my iPhone

> On Jan 6, 2019, at 7:53 PM, Roman Shaposhnik  wrote:
> 
>> On Sun, Jan 6, 2019 at 7:38 PM Alex Harui  wrote:
>> 
>> 
>> 
>> On 1/6/19, 6:58 PM, "Roman Shaposhnik"  wrote:
>> 
>>>On Sun, Jan 6, 2019 at 6:50 PM Alex Harui  
>>> wrote:
>>> 
>>> OK, apparently Infra doesn't want to discuss this in a JIRA issue so I will 
>>> try to continue it here and bug people with emails if the thread stagnates 
>>> like it did last time.
>>> 
>>> I'm unclear what questions and problems are of concern here specific to 
>>> this ask.  IMO:
>>> 1) ASF Release Policy currently allows artifacts to be packaged on other 
>>> hardware.  It just has to be verified on RM/PMC-controlled hardware
>>> 2) There is no packaging specific security risk.  Rogue executions via 
>>> Jenkins are either possible or not possible and there are plenty of other 
>>> juicy targets for rogue executions besides release artifacts that are 
>>> verifiable.
>> 
>>I don't have a strong opinion on the above, but I'm very concerned
>>about a requirement of a bot pushing to SCM repos.
>> 
>> Please explain your concern.
> 
> ASF lives and dies by how well it can track IP provenance in what we release.
> That's why any non-committer interactions around SCM will give me pause.

Releases are explicitly approved by a PMC. How can the build system results be 
approved by the PMC? Safely and confidently?

> 
>> A bot is already allowed to commit to the website repos, AIUI.
> 
> Two things:
>   1. can you give me real-world examples of that?

Website publishing is not an act of the whole PMC. It can be triggered on 
commit / done as a committer’s act.

>   2. website repos are much lower on my list of priorities than code
> repos (see above for reasoning)

Agreed. I see this question of “Release” as worth discussing. IMO what we are 
really discussing is automatically releasing build-system-produced convenience 
binaries.

Can we allow build-system-produced convenience binaries? If so, must we hold 
votes? If not, then what level of scrutiny must the PMC provide?

Regards,
Dave

> 
> Thanks,
> Roman.
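However the packaging is produced, the verification step in Alex's point 1 is 
mechanical. A minimal sketch of what a release manager might run on 
RM/PMC-controlled hardware before a vote (the file names are hypothetical, and 
the .sha512 file is assumed to be in the usual coreutils format):

    # Check the detached signature against a key in the project's KEYS file,
    # then check the digest; both must pass before the artifact goes to a vote.
    gpg --verify foo-1.0.0-bin.tar.gz.asc foo-1.0.0-bin.tar.gz
    sha512sum -c foo-1.0.0-bin.tar.gz.sha512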



Re: Can we package release artifacts on builds.a.o?

2019-01-07 Thread Dave Fisher
Hi Chris,

Thank you for providing Carlos with instructions. Was that on or off list?

Regards,
Dave

Sent from my iPhone

> On Jan 7, 2019, at 1:18 PM, Christofer Dutz  wrote:
> 
> Hi Alex,
> 
> Ways to do bad stuff with just a pom.xml:
> - simply adding a dependency to a vulnerable library, even an intentionally 
> staged malicious one.
> - Adding an exec-maven-plugin to execute anything on the host machine
> - Generate code
> - Like I introduced into the FlexJS maven build: Patch/Modify source files
> Guess that is what I could think of in 5 minutes...
> 
> Ways to do bad stuff by just changing one-line versions:
> Changing the version of a dependency to a known vulnerable version would 
> be such a one-liner.
> I'm currently introducing vulnerability checks into all of my builds, so I'm 
> bumping dependencies to non-vulnerable versions ... 
> doing this the other way around would introduce vulnerabilities.
> 
> Connectivity problems:
> Regarding network problems ... on my way to Montreal I staged the first RC 
> for Apache PLC4X on a plane ... 
> it took about 3 hours to upload because of network problems and latencies. 
> Maven usually works around connectivity problems quite nicely and reliably.
> 
> So all in all I would suggest you sort out the problems in the build with 
> someone with experience. 
> I already told Carlos how he could deploy to a local directory during the 
> release itself and then use another
> plugin to stage that release independently. 
> If it aborts, you just re-start the deployment and close the repo as soon as 
> all passes.
> 
> Chris
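Chris's two-step staging approach can be sketched roughly as follows; 
altDeploymentRepository is a stock maven-deploy-plugin property, while the 
local path is made up:

    # Step 1: "deploy" the release into a local directory instead of the remote repo.
    mvn clean deploy -DaltDeploymentRepository=local::default::file:./local-staging
    # Step 2: upload ./local-staging to the staging repository in a separate plugin
    # run; if the network flakes out, re-run only the upload, then close the repo.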
> 
> 
> 
> On 07.01.19, 19:39, "Alex Harui"  wrote:
> 
>Hi Greg,
> 
>Thanks for the history.  I agree with the general problem, however, for 
> Royale, I think the problem is constrained, but I could be wrong.  I don't 
> think there are exploits from things like missing semicolons and other code 
> exploits that can be executed against pom.xml files, so the Royale reviewers 
> are first looking to see if the bot changed any other files.  Maybe Maven experts 
> can tell us what kinds of exploit could be hacked into a pom.xml.
> 
>Could you answer another question?  What is the current state of SVN/Git 
> integration?  Could we spin up an SVN clone of our Git repos, restrict the 
> bot via SVN, then sync back from SVN to Git (all from Jenkins)?
> 
>Thanks,
>-Alex
> 
>On 1/7/19, 10:30 AM, "Greg Stein"  wrote:
> 
>On Mon, Jan 7, 2019 at 12:23 PM Alex Harui  
> wrote:
>> ...
> 
>> I still don't get why allowing a bot to commit to a Git repo isn't
>> auditable.  The changes should all be text and sent to commits@ and the
>> RMs job is to verify that those commits are ok before putting the artifacts
>> up for vote.  I'd even try to  make an email rule that checks for commits
>> from buildbot and flags changes to files that are outside of what we
>> expected.
>> 
> 
>The historic position of the Foundation is "no ability to commit 
> without a
>matched ICLA". That is different from "we'll audit any commits made by
>$bot". The trust meter is rather different between those positions,
>specifically with the "what if nobody reviews? what if a commit is 
> missed?
>what if that added semicolon is missed, yet opens a vuln?" ... With the
>"matched ICLA" position, the Foundation has the assurance of *that*
>committer, that everything is Good. ... Yet a bot cannot make any such
>assurances, despite any "best effort" of the PMC to review the bot's 
> work.
> 
>It is likely a solvable problem! My comments here are to outline
>history/policy, rather than to say "NO". These are just the parameters 
> of
>the problem space.
> 
>Cheers,
>-g
>InfraAdmin
> 
> 
> 
> 
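Chris's "one-liner" warning above is easy to make concrete. A hypothetical 
sketch of the attack, and of the commits@ review Alex proposes (the version 
numbers, property name, and bot author are all made up):

    # Hypothetical attack: silently pin a dependency back to a vulnerable version.
    sed -i 's|<guava.version>27.0-jre</guava.version>|<guava.version>11.0</guava.version>|' pom.xml
    # Review side: list every file each bot commit touched, so a human can flag
    # changes outside the expected release plumbing before the vote.
    git log --author=buildbot --name-only --oneline origin/master..HEAD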



Re: Can we package release artifacts on builds.a.o?

2019-01-07 Thread Dave Fisher
I understand you!

If you share with me, do you mind if I share on dev@royale?

Sent from my iPhone

> On Jan 7, 2019, at 1:55 PM, Christofer Dutz  wrote:
> 
> Hi Dave,
> 
> Well, it was naturally off list.
> 
> Chris
> 
> Download Outlook for Android<https://aka.ms/ghei36>
> 
> ________
> From: Dave Fisher 
> Sent: Monday, January 7, 2019 10:32:38 PM
> To: builds@apache.org
> Subject: Re: Can we package release artifacts on builds.a.o?



Re: Can we package release artifacts on builds.a.o?

2019-01-10 Thread Dave Fisher
Hi -

Sent from my iPhone

> On Jan 10, 2019, at 4:18 PM, Roman Shaposhnik  wrote:
> 
>> On Thu, Jan 10, 2019 at 12:45 AM Alex Harui  wrote:
>> 
>> 
>> 
>> On 1/9/19, 7:35 PM, "Roman Shaposhnik"  wrote:
>> 
>>>On Wed, Jan 9, 2019 at 11:38 AM Alex Harui  
>>> wrote:
>>> 
>>> Hi Greg,
>>> 
>>> You may have missed some other infra-technical questions upthread that 
>>> might help us fashion a solution.  I'll repeat them here:
>>> 
>>> 1) What is the state of Git->SVN and SVN->Git integration?  Could our job 
>>> clone git to SVN, have the bot make changes in SVN with the additional 
>>> restrictions as you said SVN could do, then sync back up to Git (including 
>>> tags as well)?
>>> 2) What would be the impact of infra creating a "RoyalePMC" committer 
>>> account?
>> 
>>That is definitely not allowed. PMC members are expected to be human
>>beings with ICLAs on file with ASF.
>> 
>> The only allowed users of the RoyalePMC account would be human PMC
>> members (technically, anyone with access to private@royale).
>> Commits from RoyalePMC would therefore have somebody's ICLA behind it.

As a Royale PMC member I’m not comfortable with being on the hook for this.

> 
> Then just do that from under individual accounts.

+1

I would be comfortable with approved PMC member credentials like those used for 
the handful of VM sysadmins from the OpenOffice PMC.

> 
>> 
>>In fact, I would go as far as to say that any PMC member willingly
>>disseminating his or her credentials for *others* to use is likely to
>>be considered for an action from the board.
>> 
>> I would agree that PMC members should not share their credentials with 
>> others,
>> hence the idea of having a RoyalePMC account, so no human has to share or
>> transfer credentials to the build machine.   Is it important to know exactly 
>> which
>> individual committed something or just that somebody with an ICLA committed 
>> something, and why?
> 
> It is very important to establish IP provenance down to an individual.

And accountability.

Regards,
Dave


> Thanks,
> Roman.



Re: Jenkins Build for Heron

2019-04-23 Thread Dave Fisher
Hi -

I am a mentor to Heron and am following this request. Does the website Jenkins 
box have the software listed below and, if so, which versions?

If there is a quick way to query that information, that would be great! If not, 
we'll figure out a way to look.
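One quick way, if a job can run a shell step on the node in question, is a 
probe along these lines (a sketch; it assumes a POSIX shell and that installed 
tools are on PATH):

    # Print which of the needed tools exist on this Jenkins node.
    for tool in make hugo gulp node npm pip go java bazel; do
        printf '%-6s: ' "$tool"
        command -v "$tool" || echo "not found"
    done
    java -version 2>&1 | head -1   # java prints its version on stderr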

> On Apr 23, 2019, at 11:50 AM, Josh Fischer  wrote:
> 
> I am one of the committers on the incubating project Heron.  I am looking
> to create a Jenkins job that will be triggered on commits to the
> "asf-site" branch to build and deploy our static assets and I have some
> questions.
> 
> 1. Does the Jenkins box have the build tools listed below already?  Or do
> you think it would be better if I downloaded and installed in the workspace
> for each build?
> 
> 2. Where would I put the static files to be served?  I'm assuming there is
> something already pre-defined in the jenkins box that I can re-use?
> 
> 
> The requirements for building our site  are as follows: (I copied our setup
> script directly  to make sure I didn't miss anything).  I hope this is
> enough detail, please let me know.
> 
> A quick overview is:
> 
> 
>   - Make 
>   - Hugo
>   - GulpJs
>   - Node.js 
>   - npm 
>   - pip  - install PyYAML>=3.12
>   - Go  (make sure that your GOPATH and GOROOT are
>   set)
>   - Java 8
>   - Bazel 0.23
> 
> 
> PLATFORM=`platform`
> if [ $PLATFORM = darwin ]; then
> go get -v github.com/gohugoio/hugo
> which wget || brew install wget
> elif [ $PLATFORM = ubuntu ]; then
> sudo apt-get install golang git mercurial -y
> export GOROOT=/usr/lib/go
> export GOPATH=$HOME/go
> export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
> go get -u -v github.com/spf13/hugo
> elif [ $PLATFORM = centos ]; then
> sudo yum -y install nodejs npm golang --enablerepo=epel
> export GOROOT=/usr/lib/go
> export GOPATH=$HOME/go
> export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
> go get -u -v github.com/spf13/hugo
> fi
> npm install
> sudo -H pip uninstall -y pygments
> sudo -H pip install pygments==2.1.3 pdoc==0.3.2
> Please Advise,
> 

Thanks
Regards,
Dave

> Josh
> 
> On Sat, Apr 13, 2019 at 7:47 PM Josh Fischer  wrote:
> 
>> Hi,
>> 
>> I am one of the committers on the incubating project Heron.  I am looking
>> to create a Jenkins job that will be triggered on commits to the
>> "asf-site" branch to build and deploy our static assets.  I'd like to check
>> if the Jenkins box supports what we will need for building our site as well
>> as get some guidance to where and how I will place the static assets to be
>> served for our site.
>> 
>> 
>> The requirements for building our site  are as follows: (I copied our
>> setup script directly  to make sure I didn't miss anything).  I hope this
>> is enough detail, please let me know.
>> 
>> A quick overview is:
>> 
>> 
>>   - Make 
>>   - Node.js 
>>   - npm 
>>   - pip  - install PyYAML>=3.12
>>   - Go  (make sure that your GOPATH and GOROOT are
>>   set)
>>   - Java 8
>>   - Bazel 0.23
>> 
>> 
>> PLATFORM=`platform`
>> if [ $PLATFORM = darwin ]; then
>> go get -v github.com/gohugoio/hugo
>> which wget || brew install wget
>> elif [ $PLATFORM = ubuntu ]; then
>> sudo apt-get install golang git mercurial -y
>> export GOROOT=/usr/lib/go
>> export GOPATH=$HOME/go
>> export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
>> go get -u -v github.com/spf13/hugo
>> elif [ $PLATFORM = centos ]; then
>> sudo yum -y install nodejs npm golang --enablerepo=epel
>> export GOROOT=/usr/lib/go
>> export GOPATH=$HOME/go
>> export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
>> go get -u -v github.com/spf13/hugo
>> fi
>> npm install
>> sudo -H pip uninstall -y pygments
>> sudo -H pip install pygments==2.1.3 pdoc==0.3.2
>> 



Re: Jenkins Build for Heron

2019-04-23 Thread Dave Fisher
Hi Gavin,

Thanks!

Josh - when you create the job under a Heron tab, make sure that just after the 
JDK selection you check “Restrict where this project can be run” and set the 
“Label Expression” to git-websites.

You can play with the shell script to look at what is where on the git-websites 
box.
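The publishing side of such a job usually amounts to: build the site, then 
commit the generated output to the asf-site branch, which is what actually gets 
served. A hedged sketch (the repo URL and the Hugo output directory are 
assumptions):

    # Build, then push the generated HTML to the branch that is served.
    git clone -b asf-site https://gitbox.apache.org/repos/asf/incubator-heron-website.git asf-site
    rm -rf asf-site/content && cp -r public asf-site/content
    cd asf-site && git add -A content
    git commit -m "Publish site from Jenkins" && git push origin asf-site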

Let me know on dev@heron if you want to discuss the Incubator site as an 
example.

Regards,
Dave

> On Apr 23, 2019, at 12:44 PM, Gavin McDonald  wrote:
> 
> Hi All,
> 
> gulp and hugo should be installed on the websites jenkins node shortly
> 
> HTH
> 
> Gav...



Re: Jenkins Build for Heron

2019-04-26 Thread Dave Fisher
Hi Josh,

You need to be added to the  hudson-jobadmin LDAP group. Only PMC Chairs can do 
this.

Is there a PMC Chair listening here who could grant Josh Karma?

Regards,
Dave

> On Apr 26, 2019, at 3:50 AM, Josh Fischer  wrote:
> 
> 
> I was looking around the site https://builds.apache.org/view/Heron%20Jobs/ . 
> I've logged in, but I don't see a way to create a freestyle 
> project/pipeline/anything in the UI.   Am I missing something obvious? 
> 
>  Please see the google drive link below for the image.
> 
> https://drive.google.com/file/d/1BGjltTiRNZRfBaT5d2heTKEfKfUe0Irz/view?usp=sharing
> 
> 
> On Tue, Apr 23, 2019 at 7:02 PM Josh Fischer  wrote:
> Thanks for the help Gavin and Dave.  I’m sure I will have some questions as I 
> go.  Will start to understand the process more tonight.  I’ll follow up with 
> a status update  and questions to dev@heron.

Re: AW: Jenkins Build for Heron

2019-04-30 Thread Dave Fisher
Use the whimsy.apache.org PPMC roster page.

Thanks,
Dave

Sent from my iPhone

> On Apr 30, 2019, at 12:58 AM, Jan Matèrne (jhm)  wrote:
> 
> I can't find Josh's name on any committer list for Heron.
> https://incubator.apache.org/projects/heron.html
> Haven't found any other committer list ...
> 
> 
> Jan
> 

Re: Fair use policy for build agents?

2019-08-25 Thread Dave Fisher
Hi -

Sent from my iPhone

> On Aug 23, 2019, at 11:06 AM, Allen Wittenauer 
>  wrote:
> 
> 
>> On Aug 23, 2019, at 9:44 AM, Gavin McDonald  wrote:
>> The issue, and I have seen this multiple times over the last few weeks,
>> is that Hadoop pre-commit builds, HBase pre-commit, HBase nightly, HBase
>> flaky tests and similar are running on multiple nodes at the same time.
> 
>The precommit jobs are exercising potential patches/PRs… of course there 
> are going to be multiples running on different nodes simultaneously.  That’s 
> how CI systems work.

Peanut gallery comment.

Why was Hadoop invented in the first place? To take long-running tests of new 
spam-filtering algorithms and distribute them across multiple computers, 
cutting test times from days to hours to minutes. I really think there needs to 
be a balance between simple integration tests and full integration.

Here is an example: before an RC of Tika or POI is voted on, hundreds of 
thousands of documents are scanned and the results are compared. The builds 
themselves have simpler integration tests. Could the Hadoop ecosystem find a 
balance between precommit and daily integration? I know it is a messy situation 
and there is a trade-off ...

Regards,
Dave

> 
>> It
>> seems that one PR or 1 commit is triggering a job or jobs that split into
>> part jobs that run on multiple nodes.
> 
>Unless there is a misconfiguration (and I haven’t been directly involved 
> with Hadoop in a year+), that’s incorrect.  There is just that much traffic 
> on these big projects.  To put this in perspective, the last time I did some 
> analysis in March of this year, it works out to be ~10 new JIRAs with patches 
> attached for Hadoop _a day_.  (Assuming an equal distribution across the 
> year/month/week/day. Which of course isn’t true.  Weekdays are higher, 
> weekends lower.)  If there are multiple iterations on those 10, well….  and 
> then there are the PRs...
> 
>> Just yesterday I saw Hadoop and HBase
>> taking up nearly 45 of 50 H* nodes. Some of these jobs take many hours.
>> Some of these jobs that take many hours are triggered on a PR or a commit
>> that could be something as trivial as a typo. This is unacceptable.
> 
>The size of the Hadoop jobs is one of the reasons why Yahoo!/Oath gave the 
> ASF machine resources. (I guess that may have happened before you were part 
> of INFRA.)  Also, the job sizes for projects using Yetus are SIGNIFICANTLY 
> reduced: the full test suite is about 20 hours.  Big projects are just that, 
> big.
> 
>> HBase
>> in particular is a Hadoop related project and should be limiting its jobs
>> to Hadoop labelled nodes H0-H21, but they are running on any and all nodes.
> 
>Then you should take that up with the HBase project.
> 
>> It is all too familiar to see one job running on a dozen or more executors,
>> the build queue is now constantly in the hundreds, despite the fact we have
>> nearly 100 nodes. This must stop.
> 
>’nearly 100 nodes’: but how many of those are dedicated to specific 
> projects?  1/3 of them are just for Cassandra and Beam. 
> 
>Also, take a look at the input on the jobs rather than just looking at the 
> job names.
> 
>It’s probably also worth pointing out that since INFRA mucked with the 
> GitHub pull request builder settings, they’ve caused a stampeding herd 
> problem.  As soon as someone runs scan on the project, ALL of the PRs get 
> triggered at once regardless of if there has been an update to the PR or not. 
>  
> 
>> Meanwhile, Chris informs me his single job to deploy to Nexus has been
>> waiting for 3 days.
> 
>It sure sounds like Chris’ job is doing something weird though, given it 
> appears it is switching nodes and such mid-job based upon their description.  
> That’s just begging to starve.
> 
> ===
> 
>Also, looking at the queue this morning (~11AM EDT), a few observations:
> 
> * The ‘ubuntu’ queue is pretty busy while ‘hadoop’ has quite a few open slots.
> 
> * There are lots of jobs in the queue that don’t support multiple runs.  So 
> they are self-starving and the problem lies with the project, not the 
> infrastructure.
> 
> * A quick pass show that some of the jobs in the queue are tied to specific 
> nodes or have such a limited set of nodes as possible hosts that _of course_ 
> they are going to get starved out.  Again, a project-level problem.
> 
> * Just looking at the queue size is clearly not going to provide any real 
> data as what the problems are without also looking into why those jobs are in 
> the queue to begin with.



Re: Fair use policy for build agents?

2019-08-25 Thread Dave Fisher



Sent from my iPhone

> On Aug 25, 2019, at 8:23 PM, Allen Wittenauer 
>  wrote:
> 
> 
> 
>> On Aug 25, 2019, at 9:13 AM, Dave Fisher  wrote:
>> Why was Hadoop invented in the first place? To take long-running tests of 
>> new spam-filtering algorithms and distribute them across multiple computers, 
>> cutting test times from days to hours to minutes.
> 
>Well, it was significantly more than that, but ok.

I guess I should have put weeks first ;-)
> 
>> I really think there needs to be a balance between simple integration tests 
>> and full integration.
> 
>You’re in luck!  That’s exactly what happens! Amongst other things, I’ll 
> be talking about how projects like Apache Hadoop, Apache HBase, and more use 
> Apache Yetus to do context sensitive testing at ACNA in a few weeks.

I’ll be there the whole time. I have an incubator talk on Tuesday morning. When 
is your talk?

Regards,
Dave


Re: A standard way to deploy an app to the VM

2019-08-26 Thread Dave Fisher
Hi -

Typically Puppet is used, and Infra can discuss the details.
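For the application itself, independent of the Puppet-managed VM configuration, 
a common pattern is a Jenkins shell step that pushes the build over ssh; a 
sketch with a hypothetical host, user, and paths:

    # Copy the built app to the project VM and restart its service.
    rsync -az --delete target/app/ deploy@myproject-vm.apache.org:/opt/myapp/
    ssh deploy@myproject-vm.apache.org 'sudo systemctl restart myapp'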

It is better to ask this on users@infra or on the slack channel.

Regards,
Dave

Sent from my iPhone

> On Aug 26, 2019, at 7:09 AM, Lukasz Lenart  wrote:
> 
> Does anyone have an example of how to do it? Or maybe a Jenkins build
> that's already running such a deployment?
> 
> 
> Thanks in advance
> -- 
> Łukasz
> + 48 606 323 122 http://www.lenart.org.pl/
> 
> pon., 10 cze 2019 o 20:50 Lukasz Lenart  napisał(a):
>> 
>> Hi,
>> 
>> Do you know if there is a standard way to deploy an app to the VM
>> hosted by ASF? Do you have an example of such a job?
>> https://issues.apache.org/jira/browse/INFRA-18522
>> 
>> 
>> Regards
>> --
>> Łukasz
>> + 48 606 323 122 http://www.lenart.org.pl/



Re: Heron Incubating Site CI Build

2019-10-21 Thread Dave Fisher
Hi Josh,

One piece of information might help get a response.

This is a static website build and you want to make sure that specific node has 
all the dependencies.

If you haven’t got help by Friday I might have a few cycles then to help.

Regards,
Dave

Sent from my iPhone

> On Oct 21, 2019, at 6:44 PM, Josh Fischer  wrote:
> 
> Hi,
> 
> Any chance someone could help me with this?
> 
>> On Fri, Oct 18, 2019 at 12:00 PM Josh Fischer  wrote:
>> 
>> Hi,
>> 
>> I am trying to get our CI process set up for our new static site.  I am
>> sending this email to understand what I need to do to make sure jenkins has
>> what it needs to successfully build the project.
>> Please let me know if you can get these packages/technologies installed in
>> the jenkins cluster and any other special instructions I will need to use
>> them in my scripts.
>> 
>> Thank you,
>> - Josh
>> 
>> We will need:
>> Node >= 8.x
>> Yarn >= 1.5.
>> Java 8
>> Bazel 0.26.0
>> Python 2.7
>> virtualenv
>> pip
>> And we will also need these binaries on the path:
>> automake cmake libtool-bin g++
>> python-setuptools python-dev python-wheel python python-pip unzip tree
>> openjdk-8-jdk virtualenv
>> 



Re: [CI] What are the troubles projects face with CI and Infra

2020-02-03 Thread Dave Fisher
Hi David,

Does the idea of having a branch that receives the CI output, like asf-site, 
help out in this situation?

If these workflows write into a branch that is only ever copied to and never 
merged back, then we would be good. It seems like we can track all “3rd party” 
commits in GitBox and have a chance to review the source of changes and flag 
anything questionable.
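Tracking those commits is scriptable; a rough sketch, where the branch name and 
the roster file are hypothetical, of flagging anything on the CI-written branch 
that did not come from a known committer:

    # List commits on the CI output branch whose author email is not on the
    # committer roster (committers.txt: one address per line).
    git log ci-output --format='%ae %h %s' | grep -v -F -f committers.txt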

Regards,
Dave

> On Feb 3, 2020, at 6:37 PM, David Nalley  wrote:
> 
> Hi Alex,
> 
> So this was explored. It creates some problems - first double the
> administration overhead - most of that is automated, but it means that
> our API usage doubles, and we're already hitting limits from Github.
> 
> Second - at least one CI vendor thanked us for not doing that exactly
> - because the 'best' way to do it is to create an org per project or
> org per repo - and then the free tier is dedicated to that org. Except
> that's essentially abusing their free tier.
> 
> Finally - from a practical perspective, if everyone submits PRs and
> does testing against this apacheci org - that has become the de facto
> repo - it's where everyone is doing their work, and it makes
> provenance tracking harder.
> 
> As an aside - the mandate for no write access is not an infrastructure
> policy, it's a legal affairs requirement - we're merely implementing
> it.
> 
> --David
> 
> On Tue, Feb 4, 2020 at 3:24 AM Alex Harui  wrote:
>> 
>> Moving board@ to BCC.  Attempting to move discussion to builds@
>> 
>> I’m fine with the ASF maintaining its position on stricter provenance and 
>> therefore disallowing third-party write-access to repos.
>> 
>> A suggestion was made, if I understood it correctly, to create a whole other 
>> set of repos that could be written to by third-parties.  Would such a thing 
>> work?  Then a committer would have to manually bring commits back from that 
>> other set to the canonical repo.  That seems viable to me.
>> 
>> A concern was raised that the project might cut its release from the “other 
>> set”, but IMO, that would be ok if the release artifacts could be verified, 
>> which should be possible by comparing the canonical repo against the “other 
>> repo”, at least for the source package, and if there are reproducible 
>> binaries, for the binary artifacts as well.
>> 
>> Thoughts?
>> -Alex
>> 
>> From: Greg Stein 
>> Reply-To: "bo...@apache.org" 
>> Date: Monday, February 3, 2020 at 5:17 PM
>> To: "bo...@apache.org" 
>> Subject: Re: [CI] What are the troubles projects face with CI and Infra
>> 
>> On Mon, Feb 3, 2020 at 6:48 PM Alex Harui  wrote:
>>> ...
>> How does Google or other non-ASF open source projects manage the provenance 
>> tracking?
>> 
>> Note that most F/OSS projects don't worry about provenance to the level the 
>> Foundation worries. That affords them some flexibility that our choices do 
>> not allow. Those projects may also choose to trust tools with write access 
>> to their repositories, hoping they will not Do Something Bad(tm). We have 
>> chosen to not provide that trust.
>> 
>> IMO, I do not think the Foundation should relax its stance on provenance, 
>> nor trust in third parties ... but that is one of the key considerations 
>> [for the Board] at the heart of being able to leverage some third party 
>> CI/CD services.
>> 
>> Cheers,
>> -g
>> 



Re: Who has not migrated yet?

2020-08-14 Thread Dave Fisher
Hi Gavin,

I still need to migrate the Incubator JBake / custom site builds.

Would someone check for, and if needed create, an Incubator folder permissioned 
to the IPMC? We had a discussion on general@ a week or two ago.

I’ll look into this later today.

Regards,
Dave

Sent from my iPhone

> On Aug 14, 2020, at 1:33 AM, Gavin McDonald  wrote:
> 
> Hi All,
> 
> Tomorrow is the deadline for migrating to ci-builds.a.o and for builds.a.o
> to be turned off.
> 
> So, who has not migrated yet?
> If not, why not? What is holding you up?
> 
> If you need help, ask.
> 
> If you have many jobs to migrate - please check out the script [1] which
> can help you
> migrate all jobs in less than 5 minutes! (I know, I've tested it!)
> 
> Are there plugins missing you need ? (except ghprb)
> What else are you waiting for?
> 
> Are there outstanding tasks that Infra needs to do that might have been
> missed?
> 
> Let's see if we can get off by end of day tomorrow
> 
> [1] -
> https://cwiki.apache.org/confluence/display/INFRA/Migrating+Jenkins+jobs+from+Jenkins+to+Cloudbees
> 
> 
> -- 
> 
> *Gavin McDonald*
> Systems Administrator
> ASF Infrastructure Team



Re: Who has not migrated yet?

2020-08-14 Thread Dave Fisher
Hi Gavin,

Thank you very much.

I’m using Uwe’s script from  
https://cwiki.apache.org/confluence/display/INFRA/Migrating+jobs+from+Jenkins+to+Cloudbees

I’m looking for the API Tokens.

I asked on Slack as well.

Regards,
Dave

> On Aug 14, 2020, at 10:48 AM, Gavin McDonald  wrote:
> 
> Hi Dave,
> 
> 
> On Fri, Aug 14, 2020 at 12:42 PM Dave Fisher  wrote:
> 
>> Hi Gavin,
>> 
>> I still need to migrate the Incubator JBake / custom site builds.
>> 
>> Would someone check and create an Incubator folder permissioned to the
>> IPMC. We had a discussion on general@ a week or two ago.
>> 
> 
> Folder created, IPMC have access
> 



Re: Controlling the images used for the builds/releases

2020-09-14 Thread Dave Fisher
Hi Jarek,

I’ve yet to read your Cwiki, but I am on the OpenOffice PMC.

(1) If you wish to discuss our build processes for CentOS, Windows, and macOS, 
please email d...@openoffice.apache.org. We are working towards our 4.1.8 
release for the 20th anniversary of OpenOffice.org.

(2) If you wish to understand the many artifacts produced:

Source - https://dist.apache.org/repos/dist/release/openoffice/4.1.7/source/
SDK - https://dist.apache.org/repos/dist/release/openoffice/4.1.7/binaries/SDK/
User installation and language packs - 
https://dist.apache.org/repos/dist/release/openoffice/4.1.7/binaries/

There are currently 41 different languages in 4 Linux flavors, 1 Windows, and 1 
macOS build.

Total installation and language binaries are 41*2*(1+1+4) = 492 binaries x 4 = 
1968 files.

Note: for macOS we create dmg files, and for Windows we create Installer exe executables.

(3) Due to the huge size of all of our binaries, OpenOffice is NOT distributed 
through the Apache mirrors. Instead, we are allowed to distribute through 
SourceForge.net.

Regards,
Dave

> On Sep 14, 2020, at 10:14 AM, Jarek Potiuk  wrote:
> 
> Joan,
> 
> I read your comment and I have a kind request - hopefully you are not yet
> out - you mentioned in the comment Open Office and artifacts that would not
> fall into the criteria proposed. Could you please point us to one or two
> examples of such artifacts and someone that could carry the discussion -
> while you are away? I think I would like to understand what the problem is
> but it might be difficult to answer your doubts without having some
> specific examples that we can base our discussion on and someone who is at
> least a bit familiar with the matter.
> 
> J.
> 
> 
> On Mon, Sep 14, 2020 at 6:30 PM Jarek Potiuk 
> wrote:
> 
>> Very true Matt.
>> 
>> I think this is really a crucial part of the proposal to define the
>> boundary between the Apache / Non-Apache artifacts (potentially with a
>> different, non-ASF compliant license).
>> 
>> The "compiled" vs.  "packaged" that I proposed is one way of looking
>> at it, rather simple and straightforward to understand, verify, and
>> reason about. But I would love to hear other ideas - maybe some other
>> communities and OSS organizations approached it already and they came
>> up with some other ways of classifying it ?
>> 
>> One thing that is quite important here - we are not really talking
>> about "releases" and we should continue avoiding the name. I have no
>> doubt that proper release is .tar.gz signed and checksummed on
>> Apache's SVN containing sources and instructions on how to build the
>> software (including the convenience packages) using platforms and
>> tools available. There are no other "releases" by ASF, and I think
>> there should not be.
>> 
>> I keep on reminding it to myself when I proposed the changes, that
>> "convenience packages" are not "official" ASF software releases so I
>> think the policies there - however legal and "correct" do not have to
>> be that strict.
>> 
>> I am not a lawyer to grasp all the implications - so I am really
>> looking at the "crowd wisdom here" to understand all the consequences.
>> I think we will never get a 100% correct and "compilable" policy (so
>> to speak). My wife is a lawyer by education, so I know very well from
>> her that "law does not compile" (which was a bit surprising to an
>> engineer like me initially).
>> 
>> I think eventually - we will have to make some interpretations and
>> assumptions, and eventually, the ASF might have to take some risks
>> when reviewing and accepting such a proposal. But the risk-taking
>> should be very well informed in this case so I think we should gather
>> a lot of inputs and opinions on that.
>> 
>> J
>> 
>> 
>> On Mon, Sep 14, 2020 at 6:08 PM Matt Sicker  wrote:
>>> 
>>> From a distribution standpoint, the point of these policies to me has
>>> been to emphasize that anything we distribute here at Apache can be
>>> safely used and copied under the terms of the Apache License. As such,
>>> source releases have always been the target, though over time, Apache
>>> has accumulated several end-user type projects that may or may not
>>> have a developer audience that knows what to do with source code. The
>>> binary distributions become a useful channel for projects so that
>>> users can actually use the project without technical knowledge of
>>> development environment setups and such. This raises a conundrum,
>>> though, that nearly any non-trivial binary software artifact will
>>> contain or link to code that is not distributed under the Apache
>>> License, but it may be compatible (e.g., GPLv3 is compatible with
>>> ALv2, but combining the two results in GPLv3 basically, not
>>> ALv2+GPLv3; this doesn't change existing licenses of course). For our
>>> end users downloading Apache artifacts, we've had a history of
>>> publishing IP-safe source code that is easily used under the ALv2. I
>>> think the historical problem behind why binary artifacts haven'

Re: Controlling the images used for the builds/releases

2020-09-14 Thread Dave Fisher
Hi Jarek,

I’m sure that you have reviewed https://www.apache.org/legal/resolved.html

I think that you might want to focus on Class B licenses in these discussions.

It might help you to keep to a more limited scope and determine how to make 
compliant Helm Charts.

The legal committee and VP are the ones making decisions about what is 
compliant.

Regards,
Dave

> On Sep 14, 2020, at 9:30 AM, Jarek Potiuk  wrote:
> 
> Very true, Matt.
> 
> I think a really crucial part of the proposal is to define the
> boundary between the Apache / non-Apache artifacts (potentially with a
> different, non-ASF-compliant license).
> 
> The "compiled" vs.  "packaged" that I proposed is one way of looking
> at it, rather simple and straightforward to understand, verify, and
> reason about. But I would love to hear other ideas - maybe some other
> communities and OSS organizations approached it already and they came
> up with some other ways of classifying it ?
> 
> One thing that is quite important here - we are not really talking
> about "releases", and we should continue avoiding that name. I have no
> doubt that the proper release is a .tar.gz, signed and checksummed, on
> Apache's SVN, containing sources and instructions on how to build the
> software (including the convenience packages) using available platforms
> and tools. There are no other "releases" by the ASF, and I think
> there should not be.
> 
> I kept reminding myself, when I proposed the changes, that
> "convenience packages" are not "official" ASF software releases, so I
> think the policies there - however legal and "correct" - do not have to
> be that strict.
> 
> I am not enough of a lawyer to grasp all the implications - so I am really
> counting on the "crowd wisdom" here to understand all the consequences.
> I think we will never get a 100% correct and "compilable" policy (so
> to speak). My wife is a lawyer by education, so I know very well from
> her that "law does not compile" (which was initially a bit surprising to
> an engineer like me).
> 
> I think eventually we will have to make some interpretations and
> assumptions, and the ASF might have to take some risks
> when reviewing and accepting such a proposal. But the risk-taking
> should be very well informed in this case, so I think we should gather
> a lot of input and opinions on that.
> 
> J
> 
> 
> On Mon, Sep 14, 2020 at 6:08 PM Matt Sicker  wrote:
>> 
>> From a distribution standpoint, the point of these policies to me has
>> been to emphasize that anything we distribute here at Apache can be
>> safely used and copied under the terms of the Apache License. As such,
>> source releases have always been the target, though over time, Apache
>> has accumulated several end-user type projects that may or may not
>> have a developer audience that knows what to do with source code. The
>> binary distributions become a useful channel for projects so that
>> users can actually use the project without technical knowledge of
>> development environment setups and such. This raises a conundrum,
>> though, that nearly any non-trivial binary software artifact will
>> contain or link to code that is not distributed under the Apache
>> License, but it may be compatible (e.g., GPLv3 is compatible with
>> ALv2, but combining the two results in GPLv3 basically, not
>> ALv2+GPLv3; this doesn't change existing licenses of course). For our
>> end users downloading Apache artifacts, we've had a history of
>> publishing IP-safe source code that is easily used under the ALv2. I
>> think the historical problem behind why binary artifacts haven't been
>> raised to the same status involves clarifying the line between where
>> our artifacts end and a third party's begin. This is especially
>> apparent in languages where the reference implementation runtime is
>> GPL (e.g., OpenJDK, though that itself has an interesting history due
>> to Apache Harmony having been a thing at one point).
>> 
>> From a security standpoint, distributing binaries requires more
>> infrastructural security to respond to potential malware infections,
>> CVEs in dependencies, etc.
>> 
>> 
>> On Mon, 14 Sep 2020 at 10:54, Jarek Potiuk  wrote:
>>> 
>>> Oh yeah. I'm starting to realize now how herculean it is :). No worries - I
>>> am afraid that when you are back, the discussion will just be warming up :).
>>> 
>>> Speaking of the "double standard" - the main reason really comes from
>>> licensing. When you compile in something that is GPL, your code becomes
>>> bound by the license. But when you just bundle it together in a software
>>> package - it does not.
>>> 
>>> So it is pretty much unavoidable to apply different rules to those two
>>> situations. No matter what, we have to make this distinction IMHO. But
>>> let's see what others say on that.  I'd love to hear your thoughts on that
>>> before you head out.
>>> 
>>> J
>>> 
>>> 
>>> On Mon, Sep 14, 2020 at 5:47 PM Joan Touzet  wrote:
>>> 
 Hi Jarek,
 
 I'm about to head out for 3 weeks, so

Re: Docker rate limits likely spell DOOM for any Apache project CI workflow relying on Docker Hub

2020-10-28 Thread Dave Fisher
Hi Joan,

I was vaguely concerned when I got that email.

Thank you for sounding the alarm and presenting a reasonable choice of actions.

The idea of an Apache Infra Docker repo is great! (Maybe dist.apache.org 
works?) A significant amount of time in Jenkins is spent building images which 
don’t change often. (Have we avoided a proper understanding of the cache pattern?)

Best Regards and hope you are well,
Dave

Sent from my iPhone

> On Oct 28, 2020, at 9:02 PM, Joan Touzet  wrote:
> 
> Got your attention?
> 
> Here's what arrived in my inbox around 4 hours ago:
> 
>> You are receiving this email because of a policy change to Docker products 
>> and services you use. On Monday, November 2, 2020 at 9am Pacific Standard 
>> Time, Docker will begin enforcing rate limits on container pulls for 
>> Anonymous and Free users. Anonymous (unauthenticated) users will be limited 
>> to 100 container image pulls every six hours, and Free (authenticated) users 
>> will be limited to 200 container image pulls every six hours, when 
>> enforcement is fully implemented. 
> 
> Their referenced blog posts are here:
> 
> https://www.docker.com/blog/scaling-docker-to-serve-millions-more-developers-network-egress/
> 
> https://www.docker.com/blog/understanding-inner-loop-development-and-pull-rates/
> 
> Since I haven't seen this discussed on the builds list yet (and I'm not
> subscribed to users@infra), I wanted to make clear the impact. I would
> bet that just about every workflow using Jenkins, buildbot, GHA or
> otherwise uses uncredentialed `docker pull` commands. If you're using
> the shared Apache CI workers, every pull you're making is counting
> towards this 100 pulls/6 hour limit. Multiply that by every ASF project
> on those servers, and multiply that again by the total number of PRs /
> change requests / builds per project, and :(
> 
> Apache's going to hit these new limits real fast. And we must act fast
> to avoid problems, as those new limits kick in **MONDAY**.
> 
> Even for those of us lucky enough to have sponsorship for dedicated CI
> workers, it's still a problem. Infra has scripts to wipe all
> not-currently-in-use Docker containers off of each machine every 24
> hours (or did, last I looked). That means you can't rely on local
> caching. Other projects may also have added --force to their `docker
> pull` requests in their CI workflows, to work around issues with cached,
> corrupted downloads (a big problem for us on the shared CI
> infrastructure), or to work around issues with the :latest tag caching
> when it shouldn't.
> 
> This extends beyond projects using CI in the way Docker outlines on
> their second blog post linked above, namely their encouragement to use
> multi-stage builds. If local caching can't be relied on, there's no
> advantage. If what's being pulled down is an image containing that
> project's full build environment - this is what CouchDB does and I
> expect others do as well, as setting up our build environment, even
> automated, takes 30-45 minutes - frequent changes to the build
> dependencies require frequent pulls of those images, which cannot be
> mitigated via the Docker-recommended multi-stage builds.
> 
> =
> 
> Proposed solutions:
> 
> 1. Infra provides credentialed logins through the Docker Hub apache
> organisation to projects. Every project would have to update their
> Jenkins/buildbot/GHA/etc workflows to consume and use these credentials
> for every `docker pull` command. This depends on Apache actually being
> exempted from the new limits (I'm not sure - are we?) and those creds
> being distributed widely...which may run into Infra Policy issues.
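> 
> For illustration, consuming such credentials in a workflow would be a
> one-liner before any pulls (the variable names here are hypothetical):
> 
>     echo "$DOCKERHUB_TOKEN" | docker login --username "$DOCKERHUB_USER" --password-stdin
>     docker pull ubuntu:20.04   # now counted against the authenticated account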
> 
> 2. Infra provides their own Docker registry. Projects that need images
> can host them there. These will be automatically exempt. Infra will have
> to plan for sufficient storage (this will get big *fast*) and bandwidth
> (same). They will also have to firewall it off from non-Apache projects.
> 
> This should be configured as a pull through caching registry, so that
> attempts to `docker pull docker.apache.org/ubuntu:latest` will
> automatically reach out to hub.docker.com and store that image locally.
> Infra can populate this registry with credentials within the ASF Docker
> Hub org that are, hopefully, exempt from these requirements.
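> 
> For the curious, a minimal sketch of such a pull-through cache, assuming
> the stock registry:2 image and a hypothetical docker.apache.org hostname:
> 
>     # run the registry as a read-through mirror of Docker Hub
>     docker run -d --name mirror -p 443:5000 \
>       -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
>       registry:2
> 
> and each CI node would opt in via /etc/docker/daemon.json:
> 
>     { "registry-mirrors": ["https://docker.apache.org"] }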
> 
> 3. Like #2, but per-project, on Infra-provided VMs. Today this is not
> practical, as the standard Infra-provided VM only has ~20GB of local
> storage. Just a handful of Docker images will eat that space nearly
> immediately.
> 
> ===
> 
> I think #2 above is the most logical and expedient, but it requires a
> commitment from Infra to make happen - and to get the message out - with
> only 4 days until DOOM.
> 
> What does the list think? More importantly, what does Infra think?
> 
> -Joan "I'm gonna sing The Doom Song now!" Touzet



Re: Hung website job

2020-11-02 Thread Dave Fisher
Hi -

I’ve been sledgehammering gitbox with OpenOffice.org migrations. I’m taking a 
break as I appear to have hit a JBake limit.

Jenkins has a 10-minute timeout configured for git checkouts.
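
(If a job needs to set that explicitly, a minimal Jenkinsfile sketch using the
standard Git plugin - the repo URL is illustrative - would be:

    checkout([$class: 'GitSCM',
              userRemoteConfigs: [[url: 'https://gitbox.apache.org/repos/asf/example-site.git']],
              extensions: [[$class: 'CloneOption', timeout: 10]]])

where the timeout value is in minutes.)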

Regards,
Dave

Sent from my iPhone

> On Nov 2, 2020, at 1:28 PM, Chris Lambertus  wrote:
> 
> Try directing your git checkouts to github. There are some circumstances 
> when gitbox becomes overloaded, let's see if the problem goes away with 
> github.
> 
> 
> 
> 
> 
> 
>> On Nov 2, 2020, at 1:21 PM, Zoran Regvart  wrote:
>> 
>> Seems that this has happened again with this job:
>> 
>> https://ci-builds.apache.org/job/Lucene/job/Solr-reference-guide-master/404/
>> 
>> as in the previous case there seems to be a snag when checking out from git.
>> 
>> Two cases with the same symptoms, perhaps it's worth investigating?
>> 
>> zoran
>> 
>>> On Tue, Oct 27, 2020 at 2:51 PM Andrei Sekretenko  
>>> wrote:
>>> 
>>> Hi Zoran,
>>> sure, this particular job can be cancelled/retried.
>>> 
>>> From looking at the log, there seems to be some infrastructure issue on the 
>>> ASF Jenkins / Git plugin/system/... level:
>>> the build has yet to reach the build script, and looks stuck at the git 
>>> checkout by the SCM plugin.
>>> This might be worth investigating on the infra side (especially given that 
>>> Jenkins hadn't aborted the stuck job in 300 minutes; I'm not 100% sure if 
>>> Jenkins is supposed to abort a stuck job at the SCM checkout step, though).
>>> 
>>> On Tue, Oct 27, 2020 at 1:32 PM Zoran Regvart  wrote:
 
 Hi Mesos folk,
 perhaps someone from your side can take a look?
 
 thanks!
 
 zoran
 -- Forwarded message -
 From: Zoran Regvart 
 Date: Tue, Oct 27, 2020 at 11:49 AM
 Subject: Hung website job
 To: builds 
 
 
 Hi Builders,
 seems that a job has hung[1] on the website1 node, preventing other
 website jobs from scheduling.
 
 Can it be canceled/retried?
 
 zoran
 [1] https://ci-builds.apache.org/job/Mesos/job/Mesos-Websitebot/716/
 --
 Zoran Regvart
 
 
 --
 Zoran Regvart
>>> 
>>> 
>>> 
>>> --
>>> --
>>> Andrei Sekretenko
>> 
>> 
>> 
>> -- 
>> Zoran Regvart
> 



Mesos #1 has been building for 2 days 3 hours

2020-11-19 Thread Dave Fisher
Hi -

https://ci-builds.apache.org/job/Mesos/job/Mesos-Release/1/

Is this build stalled?

Regards,
Dave


Re: ASF Jenkins usability [Was: Re: GA again unreasonably slow (again)]

2021-01-10 Thread Dave Fisher
Jarek,

I would suggest you have a direct chat with Greg Stein.

Best Regards,
Dave

Sent from my iPhone

> On Jan 10, 2021, at 9:08 AM, Jarek Potiuk  wrote:
> 
> On Sun, Jan 10, 2021 at 5:28 PM Matt Sicker  wrote:
> 
>> If we can get GA to handle our use case properly, that would be awesome.
>> Being in the security engineering domain, though, I’m generally pessimistic
>> about security, so please excuse my cynicism.
>> 
> 
> I totally understand. I am a bit more optimistic, especially since we could
> potentially throw some heavyweight support behind it - like a number of
> serious Apache projects making a common and coordinated statement of "We can
> either praise GitHub for their cooperation" or "They are not secure and not
> willing to improve".
> That is publicity they would either love (the former) or hate (the
> latter).
> 
> I am happy to help in any way I can - represent INFRA in talks, describe the
> problems and propose solutions, word the "carrot" and "stick" options and
> even prepare how they could look - I could take part in discussions
> with GitHub, maybe even escalate this to Microsoft if they will not show
> they are cooperating - but I have no mandate for doing so.
> I have no power to throw the weight of the ASF into the discussion. But I
> would love to do that, and lead it, if only I had this kind of power at
> least delegated to me, and were provided with the means of contacting GitHub
> and representing the ASF (but I doubt anyone would give me that power; it is
> a bit risky, as with big power I would have no big responsibility).
> 
> Tough call - I am not sure how else I can help INFRA/ASF to help me and
> others.
> 
> J.
> 
> 
>> 
>>> On Sun, Jan 10, 2021 at 03:43 Jarek Potiuk 
>>> wrote:
>>> 
>>> I have a feeling (though I cannot know for sure)
>>> that you are underestimating the power of an organization like ASF in
>>> actually 'stating' their requirements and 'expectations' towards GitHub.
>>> 
>>> I am now an engineer, but I used to be a CTO, CEO, Head of IT, and Head of
>>> Technology, and I know that a lot can be achieved by proper communication:
>>> stating your expectations clearly, following up, and pushing when you are
>>> dealing with partners like that - and engineering excellence or security
>>> perfection is not the only thing that matters. Usability, maintenance, and
>>> streamlining development matter too, and if you
>>> have "good enough security", they are more important for users.
>>> 
>>> I know that if you look at it from an "infrastructure security Jenkins"
>>> point of view, the Jenkins you manage is superior when it comes to security.
>>> This is perfectly clear, and I have no intention to question that or
>>> disagree with you.
>>> And yes - in this aspect I fully agree with you.
>>> 
>>> But there are other aspects which I see (and try to explain) - and I do
>>> deeply care about security (as you could probably see from my earlier
>>> communication). Just limiting the discussion to "who is more secure" is a
>>> terrible, terrible oversimplification.
>>> 
>>> I encourage you to exercise empathy and see it from the side I was
>>> explaining - maintenance, features, integration, streamlining development.
>>> Those are important things for developers. Less important for security
>>> engineers, of course, but if we can satisfy security, those are the things
>>> that matter.
>>> 
>>> I think currently we have mitigations for all the security problems we
>>> found at the project level. Also (as I mentioned before) we will have good
>>> leverage - via social media pressure - to push GA into solving the
>>> 'systemic' problems we found. They are not necessary for our project to
>>> solve, but it would simplify your life, as you take care of so many
>>> projects. So the security bounties that I opened are not for me - they are
>>> for the ASF as a whole and for the security team of the ASF. I exercised a
>>> lot of empathy towards your team in that, rather than only solving my
>>> problem, I also spent time and effort to push GA into solving it for all
>>> ASF projects, and in a way that ASF Infra security will be satisfied with.
>>> I did not have to do that. Yet I try to think about your needs there.
>>> 
>>> And to be honest, I expect something in return. Empathy for, and
>>> understanding of, the other needs I have - performance, usability,
>>> streamlining development, minimum engineering effort to solve our problems
>>> - is the least I can ask for. Help in dealing with GitHub and exercising
>>> the ASF's powers would be great.
>>> 
>>> Maybe with GitHub, the problem is that organizations like the ASF do not
>>> exercise their leverage and do not clearly state what is essential for
>>> them while working with partners like them?
>>> 
>>> Did the ASF explicitly contact GitHub and firmly state that solving the
>>> problem of self-hosted runners is an absolute top priority to solve our
>>> performance issues?
>>> 
>>> I do not know.
>>> 
>>> 

Re: GA again unreasonably slow (again)

2021-02-09 Thread Dave Fisher
The real hard problem is knowing when a change requires full regression and 
integration testing on all possible platforms.

I think projects are allowing lazy engineering if those making changes don’t 
know the level of testing needed for their changes.

Now, with easy lightweight branches, everything is being fully tested …

This is my 10,000 meter view.

But then I’m old school - on my first job, the mainframe printout included how 
much the run I made was costing my boss in $.

Best Regards,
Dave

Sent from my iPhone

> On Feb 9, 2021, at 9:20 AM, Jarek Potiuk  wrote:
> 
> Absolutely agree, Matt. Throwing more hardware at "all of the projects" is
> definitely not going to help - I have been saying that from the beginning -
> it is like building free motorways: the more you build, the more traffic
> flows, and the traffic jams remain. That's why I think a reasonable
> self-hosted solution that every project owns (including getting the credits
> for it) is the only viable solution IMHO - only then do you really start
> optimising stuff, because you own both the problem and the solution
> (and you do not - uncontrollably - impact other projects).
> 
> We've just opened up the self-hosted solution in Airflow today -
> announcement from Ash here:
> https://lists.apache.org/thread.html/r2e398f86479e4cbfca13c22e4499fb0becdbba20dd9d6d47e1ed30bd%40%3Cdev.airflow.apache.org%3E
> - and we will be working out any "teething problems". Once we are past that,
> we will be well on our way to achieving the goal from the first paragraph -
> i.e. being able to control both problem and solution on a per-project basis.
> And once we get some learnings, I am sure we will share our solution and
> findings more widely with other projects, so that they can apply similar
> solutions. This applies especially to the missing "security piece", which
> was a "blocker" so far, but also to the auto-scaling and tmpfs-optimisation
> results (a nice side-effect), as it seems we can eventually get 10x
> improvements in feedback time.
> 
> We love data @Airflow, so we will gather some stats that everyone will be
> able to analyse and see how much they can gain - not only from the queue
> bottleneck removal but also from improving the most important (in my
> opinion) metric for CI - feedback time. I personally think in CI there
> are only two important metrics: reliability and feedback time.
> Nothing else (including cost) matters. But if we get all three improved,
> that would be something that we would be happy for other projects to
> benefit from as well.
> 
> J.
> 
> 
> 
>> On Tue, Feb 9, 2021 at 3:16 PM Matt Sicker  wrote:
>> 
>> To be honest, this sounds exactly like the usual CI problem on every
>> platform. As your project scales up, CI becomes a Hard Problem. I don’t
>> think throwing hardware at it indefinitely works, though your research here
>> is finding most of the useful things.
>> 
>>> On Tue, Feb 9, 2021 at 02:21 Jarek Potiuk  wrote:
>>> 
>>> The report shows only the top contenders. And yes - we know it is flawed -
>>> because it shows workflows, not jobs (if you read the disclaimers - we
>>> simply do not have enough API-call quota to get detailed information for
>>> all projects).
>>> 
>>> So this is anecdotal. I also get no queue when I submit a PR at 11 pm.
>>> Actually, the whole Airflow committer team had to switch to the "night
>>> shift" because of that. And the most "traffic-heavy" projects - Spark,
>>> Pulsar, Superset, Beam, Airflow - some of the top "traffic" projects -
>>> experience the same issues and several-hour queues when they run during
>>> the EMEA day/US morning.  And we all try to help each other (for
>>> example, yesterday I helped the Pulsar team to implement the most
>>> aggressive way of cancelling their workflows:
>>> https://github.com/apache/pulsar/pull/9503
>>> (you can find a pretty good explanation there of why and how it was
>>> implemented this way); we are also working together with the Pulsar team
>>> to optimize their workflow - there is a document
>>> https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit
>>> where several people are adding their suggestions (including myself, based
>>> on Airflow experiences).
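>>> 
>>> (For reference, this kind of auto-cancellation can be sketched with GA's
>>> `concurrency` setting - a minimal, illustrative snippet:
>>> 
>>>     concurrency:
>>>       group: ${{ github.workflow }}-${{ github.ref }}
>>>       cancel-in-progress: true
>>> 
>>> i.e. a new push to the same branch or PR cancels the still-running build.)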
>>> 
>>> And with yetus' 12 (!) workflow runs over the last 2 months (!)
>>> https://pasteboard.co/JNwGLiR.png - indeed there is a high chance you have
>>> not experienced it, especially as you are the only person committing
>>> there. This is hardly representative of other projects that have 100s of
>>> committers and 100s of PRs a day. I am not sure if you are aware of
>>> that, but those are the most valuable projects for the ASF - as those are
>>> the ones that actually build community (following the "community over
>>> code" motto). If you have 3 PRs in 3 months and there are 200 other
>>> projects using GA, I think yetus is not going to show up in any meaningful
>>> statistics.
>>> 
>>> I am not sure if drawing a conclusion from a project that

Pelican Builds

2021-04-28 Thread Dave Fisher
Hi Kirs,

I’m working on Pelican builds myself and I noticed you have failing builds on 
the buildbot.

For instance: https://ci2.apache.org/#/builders/3/builds/2309

You are missing configuration in your pelicanconf.py file, specifically the 
PATH setting.

See https://docs.getpelican.com/en/latest/settings.html#basic-settings
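
A minimal sketch of the basic settings (the values are illustrative):

    # pelicanconf.py
    PATH = 'content'          # where the source pages live - this is the missing bit
    OUTPUT_PATH = 'output'
    SITENAME = 'Apache Example'
    SITEURL = ''
    TIMEZONE = 'UTC'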

Regards,
Dave

Re: Pelican Builds

2021-04-28 Thread Dave Fisher
Hi Daniel,

> On Apr 28, 2021, at 7:52 AM, Daniel Gruno  wrote:
> 
> 
> 
> On 2021/04/28 14:26:08, Dave Fisher  wrote: 
>> Hi Kirs,
>> 
>> I’m working on Pelican builds myself and I noticed you have failing builds 
>> on the buildbot.
> 
> dolphinscheduler-website is a docsite repo, not pelican. Their .asf.yaml is 
> straight-up wrong. It should not have the pelican section at all.

pelican-build.py should reject builds w/o pelicanconf.py
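
Something along these lines, as a sketch (the argument handling is
illustrative):

    import os
    import sys

    repo_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    # Refuse to run Pelican against a checkout that is not a Pelican site.
    if not os.path.isfile(os.path.join(repo_dir, "pelicanconf.py")):
        sys.exit("No pelicanconf.py found in %s - refusing to build." % repo_dir)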

Regards,
Dave

> 
>> 
>> For instance: https://ci2.apache.org/#/builders/3/builds/2309
>> 
>> You are missing configuration in your pelicanconf.py file, specifically the 
>> PATH setting.
>> 
>> See https://docs.getpelican.com/en/latest/settings.html#basic-settings
>> 
>> Regards,
>> Dave



Re: Dependabot-like solution for Apache projects

2021-09-03 Thread Dave Fisher
I have similar thoughts about branches and a PR.

If dependabot were allowed, then an FAQ page on infra.apache.org would be useful 
regarding configuration and the required review of the updated dependencies. Such 
discussion would be focused on issues such as looking for license changes and 
assuring that no supply-chain security issues have been introduced.

Regards,
Dave

> On Sep 3, 2021, at 9:20 AM, Christopher  wrote:
> 
> I feel like people are getting a bit hung up on the fact that dependabot
> creates branches in the repo directly when that isn't any different from
> what GitHub is doing for pull requests.
> 
> Dependabot creates refs in git under refs/heads/dependabot/* (this is
> customizable, to some extent by the repo owners with a config file)
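> 
> For illustration, a minimal .github/dependabot.yml (the ecosystem and
> schedule are just examples):
> 
>     version: 2
>     updates:
>       - package-ecosystem: "maven"
>         directory: "/"
>         schedule:
>           interval: "weekly"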
> 
> GitHub natively creates refs for pull requests under refs/pull/*
> 
> Both write directly to the repo. Neither is a problem. The only difference
> is that we mirror some refs and not others by default, so we forget about
> the PR refs. It should be noted that dependabot's refs (called branches,
> only because the refs are under refs/heads/*) are trivial config changes,
> and not really code and definitely don't contain any IP whose provenance
> matters. The utility as branches rather than as non-branch refs like pull
> requests is to provide a writable place for the devs to add commits to
> resolve any actual code changes to make the version bumps function before
> the PR is merged.
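> 
> You can inspect both kinds of refs with plain git, read-only:
> 
>     # list dependabot branches and GitHub's PR refs on a remote
>     git ls-remote origin 'refs/heads/dependabot/*'
>     git ls-remote origin 'refs/pull/*/head'
> 
> and, if policy ever required it, a mirror job could skip the former with a
> negative refspec (supported since git 2.29):
> 
>     git fetch origin 'refs/heads/*:refs/heads/*' '^refs/heads/dependabot/*'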
> 
> I really don't see the problem here, but if we really don't want to mirror
> these refs, like PR refs, we could just exclude them in the mirror process,
> although I think that would cause problems if devs are adding commits to
> these branches to resolve issues before merging. Either these branches
> contain trivial changes, or they contain dev additions. So either way, I
> don't see a problem mirroring them. But it could be done if policy
> disallowed it.
> 
> Personally, I think it should be allowed, as the branches are not
> substantially different from PR refs, do not contain substantial IP unless
> the devs add commits, are transient, and the tool is immensely useful.
> Since the tool is managed by GitHub itself and not another third party, and
> we have a relationship with GitHub as a hosting service for mirroring our
> repos, I think the risk is further mitigated. We already trust GitHub to
> write PR refs to the repos, and to function properly as a bidirectional
> mirror, so why not trust them for this immensely useful tool that empowers
> projects to develop more secure builds?
> 
> I think we shouldn't let nitpicking over policy technicalities get in the
> way of doing what is reasonable here. If we really don't trust these refs,
> just exclude them from the mirror process to the gitbox copy. But I see no
> reason we can't trust devs to enable this tool and decide for their own
> workflows.
> 
> On Fri, Sep 3, 2021, 06:15 sebb  wrote:
> 
>> On Fri, 3 Sept 2021 at 01:09, Olivier Lamy  wrote:
>>> 
>>> On Fri, 3 Sept 2021 at 09:57, David Jencks 
>> wrote:
>>> 
 I’m afraid I don’t understand your “the result is the same” argument.
 
>>> 
>>> result == Apache committer merging the bot commit
>>> 
>> 
>> But that is not the only change to the repo.
>> The repo also has a branch containing code committed by the 3rd party.
>> 
>> We cannot allow 3rd parties to add code directly to our repos.
>> 
 
 Let's say a company has 2 employees: Arthur, who is not an Apache
 committer on project X, and Bernadette, who is.  Arthur writes some code and
 submits a PR to project X.  In scenario 1, Bernadette merges the PR, and in
 scenario 2, Arthur does.  The result is the same!! (at least the resulting
 code is the same; there will be some difference in the fields of the
 commit) So should we allow scenario 2?
 
>>> 
>>> except in our case, Arthur (i.e. the bot) never merges his PR - only an
>>> Apache committer merges to the master/main branches
>>> 
>>> 
>>> 
 
 David Jencks
 
> On Sep 2, 2021, at 4:42 PM, Olivier Lamy  wrote:
> 
> I perfectly understand this.
> But my point was that in the end the result is the same!
> If we follow such reasoning, why do we use GitHub, as we do not control
> what is happening there?
> But yeah, I'm having an already-lost discussion :)
> 
> On Fri, 3 Sept 2021 at 09:32, David Jencks >> 
 wrote:
> 
>> The difference is whether a non-committer has write access to an Apache
>> repo.  In this case the non-committer is some code GitHub maintains that we
>> have no control over.  Why should we trust it not to modify a real branch?
>> 
>> To now argue on the other side of the issue, the git website publishing
>> workflow using .asf.yaml allows Jenkins jobs to automatically commit to
>> specific branches in Apache repos as part of publishing websites.  I can’t
>> say I’m all that clear o

Re: Persistent cache for Buildbot Pelican builds

2024-02-13 Thread Dave Fisher
One way that could work is to put a cache into a branch of the repository …

It could double as a data cache.
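
As a sketch, an orphan branch would keep the cache out of the site's normal
history (the branch name and paths are illustrative):

    git checkout --orphan build-cache   # branch with no parent history
    git rm -rf .                        # start from an empty tree
    cp -r /path/to/generated/cache/. .
    git add -A
    git commit -m "Refresh build cache"
    git push origin build-cache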

> On Feb 13, 2024, at 6:49 AM, sebb  wrote:
> 
> Is there any way to pass data between runs of Buildbot Pelican builds?
> 
> This would be useful for caching data that has to be fetched from
> elsewhere, in case the remote site is down.
> 
> Or for data that is expensive to generate but changes infrequently.
> 
> Sebb



Re: Persistent cache for Buildbot Pelican builds

2024-02-13 Thread Dave Fisher



> On Feb 13, 2024, at 2:49 PM, sebb  wrote:
> 
> On Tue, 13 Feb 2024 at 20:21, Dave Fisher  wrote:
>> 
>> One way that could work is to put a cache into a branch of the repository …
>> 
>> It could double as a data cache.
> 
> Yes, but that would require updating the repo.
> I was hoping to avoid that, as it is a wasteful use of a repository,
> as well as requiring credentials to do the update.

True. You would need storage that can be shared between the buildbot VMs, and 
then give repositories a cache directory.

Regards,
Dave


> 
>>> On Feb 13, 2024, at 6:49 AM, sebb  wrote:
>>> 
>>> Is there any way to pass data between runs of Buildbot Pelican builds?
>>> 
>>> This would be useful for caching data that has to be fetched from
>>> elsewhere, in case the remote site is down.
>>> 
>>> Or for data that is expensive to generate but changes infrequently.
>>> 
>>> Sebb
>> 



Re: Persistent cache for Buildbot Pelican builds

2024-02-14 Thread Dave Fisher
There are a few dependencies on various data files. For example, www-site depends 
on the incubator's podlings.xml, which is a hand-edited file managed by the IPMC. 
This gets broken from time to time.

> On Feb 14, 2024, at 12:56 AM, sebb  wrote:
> 
> The most recent example was build failures in www-site when Whimsy was
> unavailable.
> 
> I thought it would be simple to implement a cache, given how easy it
> is to do it so on some other build systems (e.g. the comdev-site
> Jenkins job which runs on Cloudbees).
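> 
> (A sketch of that pattern for the fetch step - the URL and cache path are
> illustrative:
> 
>     import json
>     import pathlib
>     import urllib.request
> 
>     CACHE = pathlib.Path("cache/committee-info.json")
> 
>     def fetch(url):
>         try:
>             with urllib.request.urlopen(url, timeout=30) as r:
>                 data = r.read()
>             CACHE.parent.mkdir(parents=True, exist_ok=True)
>             CACHE.write_bytes(data)    # refresh the cache on success
>         except OSError:                # URLError subclasses OSError
>             data = CACHE.read_bytes()  # fall back to the last good copy
>         return json.loads(data)
> 
> i.e. the build only fails if the remote is down *and* no cached copy exists.)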
> 
> On Wed, 14 Feb 2024 at 08:35, Volkan Yazıcı  wrote:
>> 
>> Sebb, what is the real-world problem we are trying to address? Do we have
>> Pelican builds taking too much time and causing trouble for certain projects?
>> Do we have expensive network bills due to frequent downloads?
>> 
>> On Tue, Feb 13, 2024 at 3:49 PM sebb  wrote:
>> 
>>> Is there any way to pass data between runs of Buildbot Pelican builds?
>>> 
>>> This would be useful for caching data that has to be fetched from
>>> elsewhere, in case the remote site is down.
>>> 
>>> Or for data that is expensive to generate but changes infrequently.
>>> 
>>> Sebb
>>> 



Re: Using Github Actions Trusted Publisher for PyPI releases ?

2024-06-25 Thread Dave Fisher



> On Jun 25, 2024, at 12:54 AM, Jarek Potiuk  wrote:
> 
> FYI. I created a proposal [1] in Airflow to switch to the Trusted Publisher
> workflow and, provided there is consensus, we will implement it.

This is cool.
> 
> BTW. An interesting finding - I found out while reading the docs that it's
> recommended to use a separate "GitHub Actions Environment" for such trusted
> publishing workflows. This has the added benefit that you can set up to 6
> reviewers for all the workflows run in such an environment, which removes
> the need to manually maintain a list of "release managers" who are allowed
> to upload packages to PyPI - and it needs "another reviewer" from the list
> of 6 to approve such an upload workflow - a nice security feature I had
> not expected.

A quick read suggests that these reviewers can include teams, which I interpret 
to mean it could easily be the whole of the PMC - at least those who have linked 
a GitHub account.

> 
> I will see how much of that will be reusable - I will also see if there are
> APIs that we can modify "self-serve" to allow self-management of such
> environment configuration, and in case there are, we will contribute it
> along the way to make it easy for other projects (I've already contributed a
> small feature there, so I already know how to do it).

I wonder if a future enhancement would be to use an API to connect to the ADP 
to confirm that a release to PyPI (or another distribution platform) has passed 
the PMC’s release VOTE!

Best,
Dave

> 
> [1] Airflow Proposal to switch to Trusted Publishing via Github Actions
> https://lists.apache.org/thread/t9l91nd4196n9mwsthhnx3qckcj45sxo
> [2] Github Actions Environments:
> https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment
> 
> 
> J.
> 
> On Thu, Jun 20, 2024 at 10:05 PM Jarek Potiuk  wrote:
> 
>> I had not planned to write an action; I was thinking more of bash/python to
>> pull the artifacts, and using the existing official action for publishing.
>> But yeah - good idea - I might package that into a reusable action that we
>> could use for other projects. It might be generalisable.
>> 
>> 
>> 
>> On Thu, Jun 20, 2024 at 7:52 PM Greg Stein  wrote:
>> 
>>> Hey Jarek ... note that we have an infrastructure-actions repository for
>>> "official ASF" GH Actions. If you agree with that approach, then you can
>>> dev/test there or we can move your tested Action there when you're ready
>>> to
>>> share it with others.
>>> 
>>> Cheers,
>>> Greg
>>> InfraAdmin, ASF
>>> 
>>> 
>>> On Thu, Jun 20, 2024 at 7:10 AM Jarek Potiuk  wrote:
>>> 
 Unless I hear otherwise, I **assume** there are no big reasons against
 this. My plan is that I will add a GitHub Action (manually triggered,
 limited to release managers only) which will NOT build the packages, but
 will download them from `downloads.apache.org` (or dist.apache.org for RC
 packages) and publish them to PyPI. This should be really "safe" and will
 remove the need for us to keep local PyPI keys to upload the packages.
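 
 For illustration, such a workflow could look roughly like this (the
 project path, file name and environment name are placeholders):
 
     name: Publish to PyPI
     on:
       workflow_dispatch:
     permissions:
       id-token: write   # needed for PyPI Trusted Publishing (OIDC)
     jobs:
       publish:
         runs-on: ubuntu-latest
         environment: pypi-publish
         steps:
           - name: Fetch release artifacts from downloads.apache.org
             run: |
               mkdir dist
               cd dist
               curl -fsSLO https://downloads.apache.org/example/1.0.0/example-1.0.0.tar.gz
           - name: Publish to PyPI
             uses: pypa/gh-action-pypi-publish@release/v1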
 
 This will require repo reconfiguration, so I will - likely - have to open a
 JIRA ticket with INFRA - and once I do it, I will be happy to describe the
 steps for all other projects that upload packages to PyPI and use GitHub.
 
 Does that make sense?
 
 J.
 
 
 On Fri, Jun 14, 2024 at 12:14 PM Jarek Potiuk  wrote:
 
> 
>> My only question is what do the users see in terms of the verified
>> identity that performed the release. Does it still appear to have come
>> from the individual maintainer? The ASF? Somewhere else? I'd only be
>> concerned if the answer was "somewhere else".
>> 
> 
> Currently users do not see anything. There was a discussion on Python's
> Discourse about exposing Trusted Publisher information on PyPI
> https://discuss.python.org/t/pre-pep-exposing-trusted-publisher-provenance-on-pypi/42337
> as a "pre-PEP discussion". This resulted in Draft PEP 740 -
> https://discuss.python.org/t/pep-740-index-support-for-digital-attestations/44498
> - where you will be able to upload multiple attestations when you publish
> your packages. So the thinking is that you can have multiple attestations
> of the provenance of your package when you upload it to PyPI, and a trusted
> publisher will be just one of them. So in our case we could also add our
> own signatures when we publish. This is still a draft and we will have a
> chance of influencing the direction, I am sure. Generally, Michael and the
> whole security team are on a spree of onboarding more and more projects
> to use trusted publishers, and they are planning to discuss and implement
> more security/provenance features when they reach critical mass (from the
> discussions I had - I believe they are doing very well there - and having a
> stories