Re: [JENKINS] - Plugins update

2017-09-16 Thread Joan Touzet
Oof, I see that ours (a multibranch pipeline job) is missing. Any clue
whether the job will be restored once the plugins are restored as well?

-Joan
- Original Message -
From: "Gavin McDonald" 
To: builds@apache.org
Sent: Saturday, 16 September, 2017 6:36:21 PM
Subject: Re: [JENKINS] - Plugins update

Another update

Some plugins have failed to load since they were upgraded, even though marked
as compatible and only a minor increment (1.20 to 1.21, etc.)

I am looking into re-installing those today.

In the meantime, it looks like some jobs have disappeared as a result, others 
not building etc …

Gav…

> On 16 Sep 2017, at 10:03 pm, Gavin McDonald  wrote:
> 
> 
>> On 16 Sep 2017, at 2:08 pm, Gavin McDonald  wrote:
>> 
>> Hi All,
>> 
>> Jenkins is scheduled for a restart today to upgrade some plugins.
>> Currently therefore no more jobs will run until current ones are completed 
>> and the restart has occurred.
>> 
>> Will update again after the restart.
> 
> Ok sorry for delay, a couple of restarts were necessary.
> 
> List of plugins upgraded today are listed at:
> 
> https://cwiki.apache.org/confluence/display/INFRA/Plugin+Updates 
>  
> 
> Gav…
> 
>> 
>> Gav…
>> 
>


Re: [JENKINS] - Plugins update

2017-09-16 Thread Joan Touzet
Yup, that's the one. I see the job is back now. Phew! Going
to give it a run and see if it's successful.

-Joan

- Original Message -
From: "Gavin McDonald" 
To: builds@apache.org
Cc: "Joan Touzet" 
Sent: Saturday, 16 September, 2017 8:04:39 PM
Subject: Re: [JENKINS] - Plugins update

Is this it?

https://builds.apache.org/view/All/job/CouchDB/

Gav…

On 17 Sep 2017, at 9:52 am, Gavin McDonald < ga...@16degrees.com.au > wrote: 


So that I can look out for it, which job(s) are missing? 

Gav… 



On 17 Sep 2017, at 9:37 am, Gav < ipv6g...@gmail.com > wrote: 

Hi Joan, 

not at this stage. I'm 'hoping' they will return, if not I'll look to 
backups. 

I am finding some errors with plugins such as :- 

Failure - 

java.io.IOException: Pipeline: Job v2.14.1 failed to load. 
- You must update Jenkins from v2.60.1 to v2.62 or later to run this plugin. 

I have no clue why Jenkins did not warn about this or why it let me 
install it anyway. 

That plugin could be related as to why your job is missing. 

Also, I just checked my screenshots from before starting the plugin updates,
and this plugin is not even listed, so it must have been pulled in as a
dependency of another. :/

More news later; for now, I'm fighting plugins one at a time.

I may end up upgrading Jenkins itself to the latest LTS - which was not
really scheduled until EOM - but we'll see how I get on.


Gav... 


On Sun, Sep 17, 2017 at 8:50 AM, Joan Touzet < woh...@apache.org > wrote: 



Oof, I see that ours (a multibranch pipeline job) is missing. Any clue 
whether the job will be restored once the plugins are restored as well? 

-Joan 
- Original Message - 
From: "Gavin McDonald" < ga...@16degrees.com.au > 
To: builds@apache.org 
Sent: Saturday, 16 September, 2017 6:36:21 PM 
Subject: Re: [JENKINS] - Plugins update 

Another update 

Some plugins have failed to load since they were upgraded, even though
marked as compatible and only a minor increment (1.20 to 1.21, etc.)

I am looking into re-installing those today. 

In the meantime, it looks like some jobs have disappeared as a result, 
others not building etc … 

Gav… 



On 16 Sep 2017, at 10:03 pm, Gavin McDonald < ga...@16degrees.com.au > wrote:

On 16 Sep 2017, at 2:08 pm, Gavin McDonald < ga...@16degrees.com.au > wrote:

Hi All, 

Jenkins is scheduled for a restart today to upgrade some plugins. 
Currently therefore no more jobs will run until current ones are completed
and the restart has occurred.

Will update again after the restart. 

Ok sorry for delay, a couple of restarts were necessary. 

List of plugins upgraded today are listed at: 

https://cwiki.apache.org/confluence/display/INFRA/Plugin+Updates

Gav…

Gav…

-- 
Gav... 




Re: (MXNet) Testing changes to the Jenkinsfile without merging to the repo.

2017-09-19 Thread Joan Touzet
Multibranch pipeline builds always use the Jenkinsfile in a specific branch
when testing that branch. So, your changes to the Jenkinsfile pre-merge will
only affect builds on that branch. Hopefully this is enough?

You can always create a new multibranch pipeline build that has a filter on
which branches it will build on, and have that job use the new Jenkinsfile.
You can then test it out on any branch prefixed with jenkins- (for example).
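
For the "cancel older builds of the same PR" part, a minimal sketch (scripted
syntax; it assumes the Pipeline Milestone Step plugin is available, and the
names here are purely illustrative) is to pass a milestone per build number
near the top of the Jenkinsfile, which aborts any older build that has not
reached it yet:

    // abort any older in-flight build of this branch/PR once a newer one starts
    def buildNumber = env.BUILD_NUMBER as int
    if (buildNumber > 1) {
        milestone(buildNumber - 1)
    }
    milestone(buildNumber)

    // optionally keep further builds from piling up in the queue as well
    properties([disableConcurrentBuilds()])

Treat that as a starting point only, and try it on a throwaway jenkins-* branch
first.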

-Joan
- Original Message -
From: "Meghna Baijal" 
To: builds@apache.org
Sent: Tuesday, 19 September, 2017 4:45:51 PM
Subject: (MXNet) Testing changes to the Jenkinsfile without merging to the repo.

Hi All, 
I want to make some changes to the way PR builds are triggered on MXNet as 
discussed in another email thread ("Run PR builds on Apache Jenkins only after 
the commit is reviewed")

Currently, there is a "Multibranch Pipeline" job in Jenkins that is responsible 
for these MXNet builds. This job uses the jenkinsfile located in the source 
code to build the project. 
In order to make changes to the PR build triggers, I need to add some 
properties to the Jenkinsfile script (as indicated here -
https://jenkins.io/doc/pipeline/steps/workflow-multibranch/#properties-set-job-properties)

My question is - How do I test these changes without merging them? Any untested 
experiments with this jenkinsfile might lead to all the MXNet builds breaking. 
Would it be ok to create a separate job in Apache Jenkins that points to a 
private repo? Or is there a better way that I am unaware of?

Here is what I want to change and test - 
1. Add a property to the Jenkinsfile so that if a new build is  started for a 
PR, any running or queued builds for the *same* PR are terminated.
2. Trigger PR builds based on comments in Github. 

Any help or re-direction would be appreciated!

Thanks,
Meghna Baijal


Re: INFRA-15156: problems to send mails from Jenkins

2017-10-02 Thread Joan Touzet
CouchDB is still getting mails, so it's not a server-wide outage.

-Joan
- Original Message -
From: "P. Ottlinger" 
To: builds@apache.org
Sent: Monday, 2 October, 2017 7:00:53 PM
Subject: Re: INFRA-15156: problems to send mails from Jenkins

Hi,

ping / help -
can anyone help out, since Tamaya still does not get any mails
(even after Gav's latest updates)?

Thanks,
Phil

Am 22.09.2017 um 23:26 schrieb P. Ottlinger:
> Hi,
> 
> is there a general problem with mails being sent from Jenkins?
> 
> Any other projects that have these problems?
> No mails are sent out for our jobs:
> https://builds.apache.org/view/S-Z/view/Tamaya/
> 
> I've filed
> https://issues.apache.org/jira/browse/INFRA-15156
> for that.
> 
> Is the problem related to the latest updates?
> 
> Thanks,
> Phil
> 


Re: Nightly build & push of Docker image(s) to bintray?

2017-10-20 Thread Joan Touzet
Be very careful here. Making such binaries available outside of your
immediate developer community is a violation of Apache policy:

  http://www.apache.org/dev/release-distribution.html#unreleased

The middle two bullet points are the issue here.

CouchDB decided to build packages and binaries off of master and
post them to a URL that is only shared on request to developers of
the project. As for Docker, especially because we now occupy the
apache/couchdb namespace, we only push official releases there, and
only publish a Dockerfile people can use for building off of master.
We could conceivably have our own Docker registry for master builds,
I guess, but the demand hasn't been there.

Now that the scary warning is over ;) bintray's API docs are pretty
clear, and it shouldn't be too hard to auto-push to it if you need to.
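
If it helps, the rough shape of an auto-push stage in a declarative Jenkinsfile
would be something like the sketch below - the registry host, image name and
credentials ID are placeholders (check your Bintray repo's "Set Me Up" page for
the real registry endpoint), not something I have tested for Beam:

    stage('Push nightly image') {
      steps {
        withCredentials([usernamePassword(credentialsId: 'bintray-bot-creds',
                                          usernameVariable: 'BT_USER',
                                          passwordVariable: 'BT_KEY')]) {
          sh '''
            docker login -u "$BT_USER" -p "$BT_KEY" example-docker-beam.bintray.io
            docker tag beam-nightly:latest example-docker-beam.bintray.io/beam-nightly:latest
            docker push example-docker-beam.bintray.io/beam-nightly:latest
          '''
        }
      }
    }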

-Joan

- Original Message -
From: "Kenneth Knowles" 
To: builds@apache.org
Sent: Friday, 20 October, 2017 4:38:28 PM
Subject: Nightly build & push of Docker image(s) to bintray?

Hello from the Beam project,

We have a Jenkins build that pushes a snapshot jar every 24 hours (if tests
pass) and we would like to have the same for docker images that form a core
part of the Beam project.

Can you advise on the best way to go about this?

I am aware of the use of bintray for Docker repository and my own account
is hooked up, but this would be a CI thing. Any pointers to docs or other
projects doing similar things would be super, too, and I am of course happy
to do my homework if you can offer a pointer.

Kenn


Re: Jenkins slave able to build DEB & RPM

2017-10-30 Thread Joan Touzet
This is what Apache CouchDB does to auto-build .deb and .rpm Linux
packages.

All the details are in the Jenkinsfile in our main repo, along with
the companion couchdb-ci and couchdb-pkg repos for support files.

Start here:

https://github.com/apache/couchdb/blob/master/Jenkinsfile#L100-L112

It took a bit of work to get the containers set up correctly, and
we worked hard with Infra to get the worker nodes running in a
stable fashion, but it hums along with minimal intervention now.
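
Heavily simplified, one of those packaging stages has roughly the shape below;
the image name and commands here are placeholders for illustration only - the
real steps live in the Jenkinsfile linked above:

    stage('Build Debian package') {
      agent {
        docker { image 'couchdbdev/debian-stretch-erlang:latest' }   // placeholder image name
      }
      steps {
        sh 'make dist && make -C couchdb-pkg deb'                    // placeholder commands
      }
    }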

-Joan
- Original Message -
From: "Allen Wittenauer" 
To: builds@apache.org
Sent: Monday, 30 October, 2017 1:58:18 PM
Subject: Re: Jenkins slave able to build DEB & RPM


> On Oct 30, 2017, at 4:33 AM, Dominik Psenner  wrote:
> 
> 
> On 2017-10-30 11:57, Thomas Bouron wrote:
>> Thanks for the reply and links. I went already to [1] but it wasn't clear to
>> me what distro each node was (unless going through every one of them, but...
>> there are a lot). As you said, it seems there isn't a CentOS or Red Hat
>> slave, so I'll file a request to INFRA for this then.
> 
> You also have the option to run the build with docker on ubuntu using a 
> centos docker image. I think it would be wise to evaluate that option before 
> filing a request to INFRA. The great benefit is that you can build an rpm and 
> test a built rpm on all the rhel flavored docker images that you would like 
> to support without the requirement to add additional operating systems or 
> hardware to the zoo of build slaves.

+1

Despite the issues[*], I’m looking forward to a day when INFRA brings 
the hammer down and requires everyone to use Docker on the Linux machines.  
I’ve spent the past week looking at why the Jenkins bits have become so 
unstable on the ‘Hadoop’ nodes.  One thing that is obvious is that the jobs 
running in containers are way easier to manage from the outside.  They don’t 
leave processes hanging about and provide enough hooks to make sure jobs are 
getting a ‘fair share’ of the node’s resources. Bad actor? Kill the entire 
container. Bam, gone. That’s before even removing the need to ask for software 
to be installed. [No need for 900 different versions of Java installed if 
everyone manages their own…]

* - mainly, disk space management and docker-compose creating a complete mess 
of things.


Re: Nightly build & push of Docker image(s) to bintray?

2017-11-13 Thread Joan Touzet
See https://issues.apache.org/jira/browse/INFRA-13671 where Gavin
was not particularly pleased with the idea of using bintray for snapshots.

We ended up running our own VM (couchdb-vm2) to host binary snapshots
for ourselves; integrating that into the build process was pretty easy
(scp + creds in Jenkins). We expose the nightlies with a simple nginx
setup with directory indexing enabled.
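
The scp step itself is nothing fancy - roughly the sketch below, assuming the
SSH Agent plugin is installed; the credentials ID, user and remote path are
placeholders, not our real values:

    stage('Publish nightlies') {
      steps {
        sshagent(credentials: ['couchdb-vm2-deploy-key']) {
          sh 'scp -o StrictHostKeyChecking=no pkgs/*.deb jenkins@couchdb-vm2.apache.org:/var/www/html/nightlies/'
        }
      }
    }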

-Joan
- Original Message -
From: "Kenneth Knowles" 
To: builds@apache.org
Sent: Monday, 13 November, 2017 1:52:48 PM
Subject: Re: Nightly build & push of Docker image(s) to bintray?

Following up here -

Have you (or anyone else) set up a bintray "role" account (or some such)
specifically for the purpose of having Jenkins push to a non-release
repository?

We have a place to push, we just need Jenkins to be allowed to do so. I
know very little about how Jenkins is allowed to push to Nexus, but I
assume it should be roughly analogous so the security scenario is no worse.

Context is https://issues.apache.org/jira/browse/INFRA-15382 which is
moving right along.

Kenn

On Fri, Oct 20, 2017 at 2:22 PM, Kenneth Knowles  wrote:

> Thanks for the heads up!
>
> When it comes to jars, I believe what we do is the usual and allowed
> process for release candidates / nightly test jars. They go to the snapshot
> repository [1], per the Release Policy [2] and ASF Jar FAQ [3]. So when it
> comes to jars, I am fairly sure what we are doing is OK.
>
> We have established the apache/beam-docker [4] repository for our official
> releases (there are no releases of this feature yet). So I was hoping to
> have an approved place to push release candidates / nightly test images.
> These are exactly analogous to the jars above, and are somewhat necessary
> companion artifacts.
>
> The answer could be as simple as "create apache/beam-docker-snapshots and
> get credentials onto your Jenkins workers", but TBH I'm not sure how
> Jenkins has credentials to push jars with `mvn deploy` nor how best to set
> up something analogous for `docker push`. But I'm sure this is something we
> could just figure out, once we know where we want to push.
>
> The answer could also be that we are on our own and we just need to follow
> the distribution guidelines. That is currently our status, and developers
> have to push their own containers to their own docker repos and pass magic
> command line flags for testing. But ideally there would be a nicer default
> for developers who were not touching these docker images, which are only a
> subcomponent of our project.
>
> Kenn
>
> [1] https://repository.apache.org/content/groups/snapshots/org/apache/
> [2] http://www.apache.org/legal/release-policy.html#host-rc
> [3] https://www.apache.org/dev/repository-faq.html#revolutioncode
> [4] https://bintray.com/apache/beam-docker
>
> On Fri, Oct 20, 2017 at 1:49 PM, Joan Touzet  wrote:
>
>> Be very careful here. Making such binaries available outside of your
>> immediate developer community is a violation of Apache policy:
>>
>>   http://www.apache.org/dev/release-distribution.html#unreleased
>>
>> The middle two bullet points are the issue here.
>>
>> CouchDB decided to build packages and binaries off of master and
>> post them to a URL that is only shared on request to developers of
>> the project. As for Docker, especially because we now occupy the
>> apache/couchdb namespace, we only push official releases there, and
>> only publish a Dockerfile people can use for building off of master.
>> We could conceivably have our own Docker registry for master builds,
>> I guess, but the demand hasn't been there.
>>
>> Now that the scary warning is over ;) bintray's API docs are pretty
>> clear, and it shouldn't be too hard to auto-push to it if you need to.
>>
>> -Joan
>>
>> - Original Message -
>> From: "Kenneth Knowles" 
>> To: builds@apache.org
>> Sent: Friday, 20 October, 2017 4:38:28 PM
>> Subject: Nightly build & push of Docker image(s) to bintray?
>>
>> Hello from the Beam project,
>>
>> We have a Jenkins build that pushes a snapshot jar every 24 hours (if
>> tests
>> pass) and we would like to have the same for docker images that form a
>> core
>> part of the Beam project.
>>
>> Can you advise on the best way to go about this?
>>
>> I am aware of the use of bintray for Docker repository and my own account
>> is hooked up, but this would be a CI thing. Any pointers to docs or other
>> projects doing similar things would be super, too, and I am of course
>> happy
>> to do my homework if you can offer a pointer.
>>
>> Kenn
>>
>
>


Re: Building with docker - Best practices

2017-11-14 Thread Joan Touzet
Hi Thomas,

I can't speak to anything maven related, but on 1), you might want to
look at leveraging the new Pipeline syntax, and especially the
Declarative Pipeline syntax. This lets you have a Jenkinsfile right in
your repo that contains the entire configuration for Jenkins. Keeping
your CI configuration under version control has fantastic benefits that
should be immediately obvious to anyone who's had Jenkins log them out
while updating a job configuration through the web interface. ;) Also,
you can leverage the Multibranch Pipeline job type in Jenkins which lets
you build on one, many, or all branches in a repo when they change -
very powerful. We use this to great effect for CouchDB to build on our
release branches across 11 different Erlang/OS combinations in parallel,
then a final step to push snapshot packages to our VM for internal dev
team use.
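
A stripped-down skeleton of that kind of Jenkinsfile, just to show the shape -
the node label, image and commands are illustrative, not our actual config:

    pipeline {
      agent {
        docker {
          image 'couchdbdev/ubuntu-erlang:latest'   // illustrative build image
          label 'ubuntu'
        }
      }
      stages {
        stage('Test') {
          steps {
            sh 'make check'                         // illustrative command
          }
        }
      }
      post {
        always { deleteDir() }                      // keep the shared node clean
      }
    }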

On 3) see my comments here: 

https://lists.apache.org/thread.html/b765b2d7f15c2cc68a70f4146d7ba08753d32e4da28130dcc5e4051e@%3Cbuilds.apache.org%3E

As well as here:

https://lists.apache.org/thread.html/c240e9bce761ef25ce8a594a29cc26628f97d9546e14a9fa30066896@%3Cbuilds.apache.org%3E

I'm still of the opinion this is an overly conservative policy
standpoint, but I'm complying until such time as the policy is changed.
If enough of us disagree, perhaps we can petition to change the policy
together?

-Joan
- Original Message -
From: "Thomas Bouron" 
To: builds@apache.org
Sent: Tuesday, 14 November, 2017 12:17:50 PM
Subject: Building with docker - Best practices

Hi.

Based on suggestions on this thread[1], I started to look at how to build 
everything in our project with docker. This was surprisingly straightforward,
but I have some remaining questions. I figured I was not the only one and it
might help people in the future, so here we go:

1. In Jenkins, rather than using a maven type job, I'm using a freestyle type 
job to call `docker run  ` during the build phase. Is it the right way to 
go?
2. My docker images are based on `maven:alpine` with a few extra bits and bobs on 
top. All is working fine, but how do I configure Jenkins to push built 
artifacts (SNAPSHOTs) to the Apache Maven repo? I'm sure other projects do that, but 
I couldn't figure it out so far.
3. Each git submodule requiring a custom docker image will have its own 
`Dockerfile` at the root. I was planning to create an extra Jenkins job to 
build and publish those images to Docker Hub. Does Apache have an official 
account and, if yes, should we use that? Otherwise, I'll create an account for 
our project only and share the credentials with our PMC.

Best.

[1] 
https://lists.apache.org/thread.html/204d803d92e12f566323881b8e617164a29edc4790b20d361f73dd36@%3Cbuilds.apache.org%3E


Re: [DISCUSS] Deploy releases to DockerHub

2018-06-22 Thread Joan Touzet
Interesting. Previously, I was told we were not allowed to push snapshot builds
to anywhere that the public might have access - only for our development teams.

If we are now allowed to publish snapshot/nightly/checkin-based images to
DockerHub, that eases a lot of CouchDB's pain.

As far as running a manual job, that works, or you could potentially use the
Promoted Builds plugin. (We don't use that, but I've used it in the past to
promote builds for release in a CI/CD setup.)

In our case, we have to build by checking out (from git) a specific tagged
revision. The build process detects that it was checked out from a tag, and
adjusts the in-built version number to display that it is a real release,
instead of a snapshot build. Because of this, it would make more sense for
us to use a manually triggered build that takes a parameter, using the
Build With Parameters Jenkins plugin. We could then use this to force
checkout from the specific tag.
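
Sketched out, the parameterized variant might look something like this - the
parameter name, label and build command are made up for illustration:

    pipeline {
      agent { label 'ubuntu' }
      parameters {
        string(name: 'RELEASE_TAG', defaultValue: '',
               description: 'git tag to check out and build')
      }
      stages {
        stage('Checkout tag') {
          steps {
            checkout scm
            sh 'git checkout "refs/tags/$RELEASE_TAG"'   // parameters are exposed as env vars
          }
        }
        stage('Build release artifacts') {
          steps {
            sh 'make release'                            // placeholder for the real build
          }
        }
      }
    }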

-Joan
- Original Message -
From: "Francesco Chicchiriccò" 
To: builds@apache.org
Sent: Friday, June 22, 2018 11:44:44 AM
Subject: [DISCUSS] Deploy releases to DockerHub

Hi there,
at Syncope we have just enabled - thanks to Gavin's support - the deploy to 
DockerHub via Jenkins (jobs [1][2]) - see [3]; the point is that the 3 Docker 
images cannot be simply built from their respective Dockerfiles, but need to be 
generated as part of the Maven build.

Now that SNAPSHOTs are covered by [1] and [2], I was wondering how it would be 
possible to push releases to DockerHub: I thought that we could create an 
additional job on Jenkins, which we will manually run at the end of our release 
process [4]; such a job will run on the GIT tag of the release.

Gavin suggested to ask here for discussion, so here I am.
WDYT?

Regards.

[1] https://builds.apache.org/view/S-Z/view/Syncope/job/Syncope-2_0_X-deploy/
[2] https://builds.apache.org/view/S-Z/view/Syncope/job/Syncope-master-deploy/
[3] https://issues.apache.org/jira/browse/INFRA-16647
[4] http://syncope.apache.org/release-process#Finalize_the_release


Jenkins build hosts filling up...needs everyone's help!

2018-07-13 Thread Joan Touzet
Hi there,

Chris over in  https://issues.apache.org/jira/browse/INFRA-16768
recommended I start a thread.

We've been getting increasing numbers of failures on our builds
due to nodes running out of disk space. In 16768, Chris says:

"We are getting to the point where builds are running machines out of space 
faster than we can clear them out. I've cleaned up H24 a bit. We'll discuss 
further, but this is going to take some cooperation amongst all builders to 
start purging workspaces."

So this is the requested thread. What do we, collectively, as
Jenkins users, need to do to clean out workspaces? I'm
fairly sure that CouchDB workspaces are pretty clean, since we
do the bulk of our builds in /tmp and try our best to clean up
after failed builds. But I am happy to admit I don't know
everything I should or shouldn't be doing.

Chris, would a list of the "top offenders" be useful? I'm not
looking to shame anyone, but shedding a little light on the
approaches that are the biggest problem might help.

-Joan


Re: Jenkins build hosts filling up...needs everyone's help!

2018-07-21 Thread Joan Touzet
Yes - you can do this in your own Jenkinsfile or job description.

In a Pipeline build (declarative or procedural), use deleteDir() :
  
https://jenkins.io/doc/pipeline/steps/workflow-basic-steps/#-deletedir-%20recursively%20delete%20the%20current%20directory%20from%20the%20workspace

In a old-school build, use the Workspace Cleanup Plugin:

  https://plugins.jenkins.io/ws-cleanup

Both can be used to clean the workspace before AND after a build.
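
In a declarative Jenkinsfile that boils down to a few lines, e.g.:

    post {
      always {
        deleteDir()   // recursively removes the workspace once the build finishes
      }
    }

(cleanWs() from the Workspace Cleanup Plugin can be used the same way, if you
prefer its extra options.)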
- Original Message -
From: "Dominik Psenner" 
To: builds@apache.org
Sent: Friday, July 20, 2018 3:05:27 PM
Subject: Re: Jenkins build hosts filling up...needs everyone's help!

Would there be ways to automate a rm -rf of a build workspace when the
build has completed? Doing so would make the problem disappear. Surely
builds take a little more time and disk, network and cpu usage will grow.
But builds start from a pristine workspace, which is desirable from my pov,
and the full disk problem disappears too, so long as a single job does not use
more than the available space on a build machine.

On Fri, 20 Jul 2018, 19:52 Dan Kirkwood,  wrote:

> Thanks,  Mike..
>
> I've added some better cleanup,  I hope,  using trap -- so hopefully will
> lessen our contribution to the problem.
>
> -dan
>
> On 2018/07/20 16:43:31, Mike Jumper  wrote:
> > On Thu, Jul 19, 2018, 10:36 Dan Kirkwood  wrote:
> >
> > > TrafficControl uses docker-compose to build each component,  so the
> disk
> > > space used is within docker's space -- not the workspace.   We do
> attempt
> > > to clean up after each build,  but would be happy to get advice to
> improve
> > > how we're doing it.   Are there best practices others use?
> > >
> >
> > We've been using Docker for Guacamole's Jenkins builds, as well,
> > autogenerating a Dockerfile which performs the actual build within a
> > pristine environment.
> >
> > Not sure how different things would need to be with docker-compose, but
> we
> > use a combination of the "trap" command (to ensure cleanup happens
> > regardless of build result) and the "--no-cache" and "--rm" flags:
> >
> > # Remove image regardless of build result
> > export TAG="guac-${BUILD_TAG}"
> > trap "docker rmi --force $TAG || true" EXIT
> >
> > # Perform build
> > docker build --no-cache=true --rm --tag "$TAG" .
> >
> > - Mike
> >
>


Re: Jenkins Slave Workspace Retention

2018-07-23 Thread Joan Touzet
I realize our use case is a bit different than some of the big Java
projects here, but I thought I'd give a few tips and tricks that have
helped us along the way:

Allen Wittenauer  said:
> > On Jul 23, 2018, at 12:45 AM, Gavin McDonald
> >  wrote:
> > 
> > Is there any reason at all to keep the 'workspace' dirs of builds
> > on the
> > jenkins slaves ?
> 
>   - Some jobs download and build external dependencies, using the
>   workspace directories as a cache and to avoid sending more work to
>   INFRA.  Removing the cache may greatly increase build time, network
>   bandwidth, and potentially increase INFRA’s workload.

This is why we switched to Docker for ASF Jenkins CI. By pre-building our
Docker container images for CI, we take control over the build environment
in a very proactive way, reducing Infra's investment to just keeping the
build nodes up, running, and with sufficient disk space.

The build environments supported on the build hosts in Jenkins seem very
skewed towards Java dependencies (which is fine), and our build chain is
entirely orthogonal (specific JavaScript libraries, Erlang versions, etc.)
It also lets us do things like pre-clone the source tree into the image, so
that only deltas need to be checked out on each new build that runs.

The hiccup then becomes pulling that image down onto each build node at
the start of each build step. We forcibly launch a `docker pull` command in
our Jenkinsfile, since otherwise an update to our build image may get missed.
If the `docker pull` fails, the build fails, but then again it would also
fail in the no-Docker case if dependencies could not be downloaded and run
locally.
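
Concretely, it is just a stage like the one below at the front of the pipeline
(the image name is illustrative):

    stage('Refresh build image') {
      steps {
        // a failed pull fails the build, which is what we want
        sh 'docker pull couchdbdev/ubuntu-erlang:latest'
      }
    }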

It also means that, once a build is done, there is no mess on the Jenkins
build node to clean up - just a regular `docker rm` or `docker rmi` is
sufficient to restore disk space. Infra is already running these aggressively,
since if a build hangs due to an unresponsive docker daemon or network
failure, our post-run script to clean up after ourselves may never run.

>   - Many jobs don’t put everything into the saved artifacts due to
>   size constraints.  Removing the workspace will almost certainly
>   guarantee that artifact usage goes way up as the need to grab (or
>   cache) bits from the workspace will be impossible with an overly
>   aggressive workspace deletion policy.

We don't put everything into saved artefacts either, but we have built a
simple Apache CouchDB-based database to which we upload any artefacts we
want to save for development purposes only. (Right now, we use this
mechanism to save build logs and other files for failed runs only.) We
run this service for ourselves on couchdb-vm2.a.o. We'd be happy to
explain how you can use something similar in your own builds, if it would
help reduce the burden on the main Jenkins infra.
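
The upload itself is just an authenticated HTTP PUT from the Jenkinsfile; a
rough, untested sketch, with the database URL and credentials ID as
placeholders:

    post {
      failure {
        withCredentials([usernamePassword(credentialsId: 'ci-logs-db',
                                          usernameVariable: 'DB_USER',
                                          passwordVariable: 'DB_PASS')]) {
          sh '''
            tar czf build-logs.tar.gz logs/ || true
            # upload the tarball to a per-build document in the logs database
            curl -sf -u "$DB_USER:$DB_PASS" -X PUT \
                 "https://logs.example.org/ci_failures/$BUILD_TAG/build-logs.tar.gz" \
                 -H 'Content-Type: application/gzip' \
                 --data-binary @build-logs.tar.gz
          '''
        }
      }
    }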

>   Maven, ant, etc don’t perform directory locks on local repositories.
>   Separate storage areas for jars are key so that multiple executors
>   don’t step all over each other.  This was a HUGE problem for a lot
>   of jobs when multiple executors were introduced a few years ago.

We had this issue too - which is why we build under a `/tmp` directory
inside the Docker container to avoid one build trashing another build's
workspace directory via the multi-node sync mechanism.

-Joan


Re: Jenkins Slave Workspace Retention

2018-07-23 Thread Joan Touzet
HI Allen,

>   It’s also worth pointing out that “just use Docker” only works if
>   one is building on Linux.  That isn’t an option on Windows.  

Not true, you can host Windows containers inside of Windows hosts now,
apparently - though I've not tried it.

>   … and where does this DB run? 

As mentioned, we have a couchdb-vm2.apache.org VM provided by Infra for
this. We are happy with it. It is pretty much hands-off.

-Joan


Re: HBase nightly job failing forever

2018-07-25 Thread Joan Touzet
I'll speak to CouchDB - the donation is directly in the form of a Jenkins
build agent with our tag; no money changes hands. The donor received
a letter from fundraising@a.o allowing a tax deduction for the equivalent of
what leasing the machine would have cost the ASF for a year.
We have 24x7 support on the node from the provider, who performs
all sysadmin (rather than burdening Infra with having to run puppet on our
build machine). This was arranged so we could have a FreeBSD node in the
build array.

We have another donor in the wings who will be adding a build node for
us; at that point, we expect to move all of our builds to our own Jenkins
build agents and won't be in the common pool any longer. The number of
failed builds in our stream that are directly related to this "tragedy of
the commons" far exceeds the number of successful builds at this point,
and unfortunately Travis CI is having parallel capacity issues that prevent
us from moving to them wholesale as well.

-Joan

- Original Message -
From: "Andrew Purtell" 
To: ipv6g...@gmail.com
Cc: "Andrew Purtell" , "dev" , 
builds@apache.org
Sent: Wednesday, July 25, 2018 12:22:08 PM
Subject: Re: HBase nightly job failing forever

How does a targeted hardware donation work? I was under the impression that
targeted donations are not accepted by the ASF. Maybe it is different in
infrastructure, but this is the first time I've heard of it. Who does the
donation on those projects? DataStax for Cassandra? Who for CouchDB? Google
for Beam? By what process are the donations made and how are they audited
to confirm the donation is spent on the desired resources? Can we get a
contact for one of them for testimonial regarding this process? Is this
process documented?




On Tue, Jul 24, 2018 at 4:27 PM Gav  wrote:

> Hi Andrew,
>
> On Wed, Jul 25, 2018 at 3:21 AM Andrew Purtell 
> wrote:
>
>> Thanks for this note.
>>
>> I'm release managing the 1.4 release. I have been running the unit test
>> suite on reasonably endowed EC2 instances and there are no observed always
>> failing tests. A few can be flaky. In comparison the Apache test resources
>> have been heavily resource constrained for years and frequently suffer from
>> environmental effects like botched settings, disk space issues, and
>> contention with other test executors.
>>
>
> Our Jenkins nodes are configured via puppet these days and are pretty
> stable - which settings do you know of that might (still) be botched?
> Yes, resources are shared and on occasion run to capacity. This is one
> reason for my initial mail - these HBase builds are consuming 10 or more
> executors
> -at the same time- and are starving executors for other builds. The fact
> these tests have been failing for well over a month and that you mention
> below  will be
> ignoring them does not make for good cross ASF community spirit, we are
> all in this together and every little bit helps. This is not a target at
> one project, others
> will be getting a similar note and I hope we can come to a resolution
> suitable for all.
> Disk space issues , yes, not on most of the Hadoop and related projects
> nodes - H0-H12 do not have disk space issues. As a Hadoop related project
> HBase should really be concentrating its builds there.
>
>
>> I think a 1.4 release will happen regardless of the job test results on
>> Apache infrastructure. I tend to ignore them as noisy and low signal.
>> Others in the HBase community don't necessarily feel the same, so please
>> don't take my viewpoint as particularly representative. We could try Alan's
>> suggestion first, before ignoring them outright.
>>
>
> No problem
>
>
>> Has anyone given thought toward expanding the pool of test build
>> resources? Or roping in cloud instances on demand? Jenkins has support for
>> that.
>>
>
> We currently have 19 Hadoop-specific nodes available, H0-H19, and another 28
> or so general use 'ubuntu' nodes for all to use. In addition we have projects
> that have targeted donated resources, and the likes of Cassandra, CouchDB
> and Beam all have multiple nodes on which they have priority. I'll throw an
> idea out there that perhaps HBase could do something similar to increase our
> node pool and at the same time have priority on a few nodes of their own via
> a targeted hardware donation.
> Cloud on demand has been tried a year or two ago, we will revisit this
> also soon.
>
> Summary then, we currently have over 80 nodes connected to our Jenkins
> master - what figure did you have in mind when you say 'expanding the pool
> of test build resources' ?
>
> Thanks
>
> Gav...
>
>
>>
>> On Tue, Jul 24, 2018 at 9:16 AM Allen Wittenauer
>>  wrote:
>>
>>> I suspect the bigger issue is that the hbase tests are running
>>> on the ‘ubuntu’ machines. Since they only have ~300GB for workspaces, the
>>> hbase tests are eating a significant majority of it and likely could be
>>> dying randomly due to space issues.  [All the hbase workspace

Re: New Jenkins Nodes and some rolling maintenance

2018-10-22 Thread Joan Touzet
Thanks for the info, Gavin.

I (re-)started a job about 45 minutes ago and the job is unable to find any
nodes labelled 'ubuntu.' I tried restarting it again just now, and the same
problem occurred. Is this related - should we expect a full Jenkins outage
right now?

Text from the job:

"H13 doesn’t have label ubuntu; H18 doesn’t have label ubuntu; H27 is offline; 
H28 is offline; H29 is offline; H30 is offline; H31 is offline; H32 is offline; 
H33 is offline; H34 is offline; H35 is offline; Jenkins doesn’t have label 
ubuntu; arm1 doesn’t have label ubuntu; beam1 doesn’t have label ubuntu; beam10 
doesn’t have label ubuntu; beam11 doesn’t have label ubuntu; beam12 doesn’t 
have label ubuntu; beam13 doesn’t have label ubuntu; beam14 doesn’t have label 
ubuntu; beam15 doesn’t have label ubuntu; beam16 doesn’t have label ubuntu; 
beam2 doesn’t have label ubuntu; beam3 doesn’t have label ubuntu; beam4 doesn’t 
have label ubuntu; beam5 doesn’t have label ubuntu; beam6 doesn’t have label 
ubuntu; beam7 doesn’t have label ubuntu; beam8 doesn’t have label ubuntu; beam9 
doesn’t have label ubuntu; cassandra1 doesn’t have label ubuntu; cassandra10 
doesn’t have label ubuntu; cassandra11 doesn’t have label ubuntu; cassandra12 
doesn’t have label ubuntu; cassandra13 doesn’t have label ubuntu; cassandra14 
doesn’t have label ubuntu; cassandra15 doesn’t have label ubuntu; cassandra16 
doesn’t have label ubuntu; cassandra2 doesn’t have label ubuntu; cassandra3 
doesn’t have label ubuntu; cassandra4 doesn’t have label ubuntu; cassandra5 
doesn’t have label ubuntu; cassandra6 doesn’t have label ubuntu; cassandra7 
doesn’t have label ubuntu; cassandra8 doesn’t have label ubuntu; cassandra9 
doesn’t have label ubuntu; couchdb-freebsd1 doesn’t have label ubuntu; 
couchdb-macos1 doesn’t have label ubuntu; hadoop-ppc64le-1 doesn’t have label 
ubuntu; hadoop-win1 doesn’t have label ubuntu; plc4x1 doesn’t have label 
ubuntu; tapestry doesn’t have label ubuntu; ubuntu-2 is offline; ubuntu-ppc64le 
doesn’t have label ubuntu; websites1 doesn’t have label ubuntu; windows-2012-1 
doesn’t have label ubuntu; windows-2012-2 doesn’t have label ubuntu; 
windows-2012-3 doesn’t have label ubuntu; windows-2016-1 doesn’t have label 
ubuntu; windows-2016-2 doesn’t have label ubuntu; windows-2016-3 doesn’t have 
label ubuntu"
- Original Message -
From: "Gavin McDonald" 
To: "builds" , operati...@apache.org
Sent: Saturday, October 20, 2018 5:07:36 PM
Subject: New Jenkins Nodes and some rolling maintenance

Hi All,

oath have donated some more nodes - H36 to H43 are online now and accepting
builds.
These nodes have 4TB (3.6TB real) disks and so will not suffer the recent
issues some of our other donated nodes have.

Speaking of - all of the oath donated nodes H12 to H35 which currently give
us disk issues  - they have 500GB (364GB real) disk - are getting a make
over and an increase in disk space to 4TB (3.6TB) to match H0 to H11 and
H36 to H43.

These will be done in batches - H27 to H35 are down now in readiness to be
done this week.

Thanks for your patience whilst we do this , and, enjoy the extra 8 nodes.

Gav...


Re: New Jenkins Nodes and some rolling maintenance

2018-10-22 Thread Joan Touzet
And just like that, it's started working again.

Thanks for all your (collective) hard work on keeping Jenkins up and running
for all of us. I know it's not easy.

All the best,
Joan
- Original Message -----
From: "Joan Touzet" 
To: builds@apache.org, gmcdon...@apache.org
Cc: operati...@apache.org
Sent: Monday, October 22, 2018 7:20:47 PM
Subject: Re: New Jenkins Nodes and some rolling maintenance

Thanks for the info, Gavin.

I (re-)started a job about 45 minutes ago and the job is unable to find any
nodes labelled 'ubuntu.' I tried restarting it again just now, and the same
problem occurred. Is this related - should we expect a full Jenkins outage
right now?

Text from the job:

"H13 doesn’t have label ubuntu; H18 doesn’t have label ubuntu; H27 is offline; 
H28 is offline; H29 is offline; H30 is offline; H31 is offline; H32 is offline; 
H33 is offline; H34 is offline; H35 is offline; Jenkins doesn’t have label 
ubuntu; arm1 doesn’t have label ubuntu; beam1 doesn’t have label ubuntu; beam10 
doesn’t have label ubuntu; beam11 doesn’t have label ubuntu; beam12 doesn’t 
have label ubuntu; beam13 doesn’t have label ubuntu; beam14 doesn’t have label 
ubuntu; beam15 doesn’t have label ubuntu; beam16 doesn’t have label ubuntu; 
beam2 doesn’t have label ubuntu; beam3 doesn’t have label ubuntu; beam4 doesn’t 
have label ubuntu; beam5 doesn’t have label ubuntu; beam6 doesn’t have label 
ubuntu; beam7 doesn’t have label ubuntu; beam8 doesn’t have label ubuntu; beam9 
doesn’t have label ubuntu; cassandra1 doesn’t have label ubuntu; cassandra10 
doesn’t have label ubuntu; cassandra11 doesn’t have label ubuntu; cassandra12 
doesn’t have label ubuntu; cassandra13 doesn’t have label ubuntu; cassandra14 
doesn’t have label ubuntu; cassandra15 doesn’t have label ubuntu; cassandra16 
doesn’t have label ubuntu; cassandra2 doesn’t have label ubuntu; cassandra3 
doesn’t have label ubuntu; cassandra4 doesn’t have label ubuntu; cassandra5 
doesn’t have label ubuntu; cassandra6 doesn’t have label ubuntu; cassandra7 
doesn’t have label ubuntu; cassandra8 doesn’t have label ubuntu; cassandra9 
doesn’t have label ubuntu; couchdb-freebsd1 doesn’t have label ubuntu; 
couchdb-macos1 doesn’t have label ubuntu; hadoop-ppc64le-1 doesn’t have label 
ubuntu; hadoop-win1 doesn’t have label ubuntu; plc4x1 doesn’t have label 
ubuntu; tapestry doesn’t have label ubuntu; ubuntu-2 is offline; ubuntu-ppc64le 
doesn’t have label ubuntu; websites1 doesn’t have label ubuntu; windows-2012-1 
doesn’t have label ubuntu; windows-2012-2 doesn’t have label ubuntu; 
windows-2012-3 doesn’t have label ubuntu; windows-2016-1 doesn’t have label 
ubuntu; windows-2016-2 doesn’t have label ubuntu; windows-2016-3 doesn’t have 
label ubuntu"
- Original Message -
From: "Gavin McDonald" 
To: "builds" , operati...@apache.org
Sent: Saturday, October 20, 2018 5:07:36 PM
Subject: New Jenkins Nodes and some rolling maintenance

Hi All,

oath have donated some more nodes - H36 to H43 are online now and accepting
builds.
These nodes have 4TB (3.6TB real) disks and so will not suffer the recent
issues some of our other donated nodes have.

Speaking of - all of the oath donated nodes H12 to H35 which currently give
us disk issues  - they have 500GB (364GB real) disk - are getting a make
over and an increase in disk space to 4TB (3.6TB) to match H0 to H11 and
H36 to H43.

These will be done in batches - H27 to H35 are down now in readiness to be
done this week.

Thanks for your patience whilst we do this , and, enjoy the extra 8 nodes.

Gav...


Re: Can we package release artifacts on builds.a.o?

2018-12-08 Thread Joan Touzet
I would like to see support for something like this as well, even if it came 
down to individual VMs/donated HW per project, locked down by project - only 
project X can use build machine X'.

Automated repeatable builds actually *increase* trust vs. who knows what a 
release manager has running on their workstation. At this point, I trust Docker 
builds with published, auditable cryptographic hashes per layer more than I 
trust some Apache releases.

I don't actually believe that all projects in the Apache world are actually 
following the strict edict of "human must run the build and push any binary 
release," but I'm not going to point fingers.

-Joan
- Original Message -
From: "Alex Harui" 
To: builds@apache.org
Sent: Saturday, December 8, 2018 12:43:37 PM
Subject: Re: Can we package release artifacts on builds.a.o?

Gavin, Alan, Karl,

Thanks for the information.

This email implies that there is a Jenkins node that can commit something.  
What creds are used for that?  Is there a buildbot user?
https://lists.apache.org/thread.html/efed1ff44fbfe5770ea1574b2f53a5295ae8326c5a3a5feb9f88cd48@%3Cbuilds.apache.org%3E

If so, I was imagining the following workflow:

1) Jenkins runs Maven release.  I forgot about the PGP signing part.  If there 
is no way to skip it, then can a buildbot "user" PGP sign it?
2) RM downloads the artifacts and verifies them.  The source package has to 
match the tag so I think that would detect any injections from other stuff 
running in Jenkins or elsewhere on the build server.  There's been a recent 
discussion on reproducible binaries and if this workflow is approved I would 
make sure our binaries are reproducible, and that should again detect any injections 
from the build server.
3) RM adds his/her PGP signature to the artifacts.  Not sure if there is a 
Maven way to do that.
4) Voting and other steps follow from there.

These would not be continuously running jobs.  They would have to be kicked off 
manually so it shouldn't add significant load, and we would know which commits 
came from buildbot so we could detect if anything went funky.

Thoughts?
-Alex

On 12/8/18, 7:54 AM, "Gavin McDonald"  wrote:

additionally, nobody should have their creds stored anywhere other than
their own machine.

On Sat, Dec 8, 2018 at 3:49 PM Allen Wittenauer
 wrote:

>
>
> > On Dec 7, 2018, at 11:56 PM, Alex Harui 
> wrote:
> >
> >
> >
> > On 12/7/18, 10:49 PM, "Allen Wittenauer" 

> wrote:
> >
> >
> >
> >> On Dec 7, 2018, at 10:22 PM, Alex Harui 
> wrote:
> >>
> >> Maven's release plugins commit and push to Git and upload to
> repository.a.o.  I saw that some folks have a node that can commit to the
> a.o website SVN.  Is anyone already doing releases from builds?  What
> issues are there, if any?
> >
> >   It's just flat out not secure enough to do a release on.
> >
> > Can you give me an example of how it isn't secure enough?
>
>
> The primary purpose of these servers is to run untested,
> unverified code.
>
> Jenkins has some very sharp security corners that makes it
> trivially un-trustable.  Something easy to understand: when Jenkins is
> configured to run multiple builds on a node, all builds on that node run in
> the same user space. Because there is no separation between executors, it's
> very possible for anyone to execute something that modifies another running
> build.  For example, probably the biggest bang for the least amount of work
> would be to replace jars in the shared maven cache.
>
> [... and no, Docker doesn't help.]
>
> There are other, bigger problems, but I'd rather not put that out
> in the public.
>
>
>

-- 
Gav...



Re: Can we package release artifacts on builds.a.o?

2018-12-11 Thread Joan Touzet
Back on this topic, the recent post on Jenkins has me thinking again.

Jenkins users are deploying directly to Nexus with builds.

Isn't that speaking out of both sides of our mouths at the same time, if Java 
developers can push release builds directly to Nexus but non-Java developers 
can't?

Perhaps I'm misunderstanding...are the Nexus-published builds not treated the 
same because they're not on dist.apache.org? Or are they not release versions?

I'm just asking for equal treatment here.

-Joan

- Original Message -
From: "Alex Harui" 
To: builds@apache.org, woh...@apache.org
Sent: Saturday, December 8, 2018 1:52:14 PM
Subject: Re: Can we package release artifacts on builds.a.o?

Good to know it isn't just me.

I could be wrong, but I believe the "policy" at Apache is only that a human 
must verify the packages and PGP sign them.  The packages can be built on 
another machine.

-Alex

On 12/8/18, 10:48 AM, "Joan Touzet"  wrote:

I would like to see support for something like this as well, even if it 
came down to individual VMs/donated HW per project, locked down by project - 
only project X can use build machine X'.

Automated repeatable builds actually *increase* trust vs. who knows what a 
release manager has running on their workstation. At this point, I trust Docker 
builds with published, auditable cryptographic hashes per layer more than I 
trust some Apache releases.

I don't actually believe that all projects in the Apache world are actually 
following the strict edict of "human must run the build and push any binary 
release," but I'm not going to point fingers.

-Joan
- Original Message -
From: "Alex Harui" 
To: builds@apache.org
Sent: Saturday, December 8, 2018 12:43:37 PM
Subject: Re: Can we package release artifacts on builds.a.o?

Gavin, Alan, Karl,

Thanks for the information.

This email implies that there is a Jenkins node that can commit something.  
What creds are used for that?  Is there a buildbot user?

https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.apache.org%2Fthread.html%2Fefed1ff44fbfe5770ea1574b2f53a5295ae8326c5a3a5feb9f88cd48%40%253Cbuilds.apache.org%253E&data=02%7C01%7Caharui%40adobe.com%7C5b7803529e3943bd483808d65d3daf70%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636798916859824875&sdata=vJouQVPXMtHyvxOo%2BZuhhi0TmAdw9nJYPTC3HTaj1%2B0%3D&reserved=0

If so, I was imagining the following workflow:

1) Jenkins runs Maven release.  I forgot about the PGP signing part.  If 
there is no way to skip it, then can a buildbot "user" PGP sign it?
2) RM downloads the artifacts and verifies them.  The source package has to 
match the tag so I think that would detect any injections from other stuff 
running in Jenkins or elsewhere on the build server.  There's been a recent 
discussion on reproducible binaries and if this workflow is approved I would 
make sure our binaries are reproducible, and that should again detect any injections 
from the build server.
3) RM adds his/her PGP signature to the artifacts.  Not sure if there is a 
Maven way to do that.
4) Voting and other steps follow from there.

These would not be continuously running jobs.  They would have to be kicked 
off manually so it shouldn't add significant load, and we would know which 
commits came from buildbot so we could detect if anything went funky.

Thoughts?
-Alex

On 12/8/18, 7:54 AM, "Gavin McDonald"  wrote:

additionally, nobody should have their creds stored anywhere other than
their own machine.

On Sat, Dec 8, 2018 at 3:49 PM Allen Wittenauer
 wrote:

>
>
> > On Dec 7, 2018, at 11:56 PM, Alex Harui 
> wrote:
> >
> >
> >
> > On 12/7/18, 10:49 PM, "Allen Wittenauer" 

> wrote:
> >
> >
> >
> >> On Dec 7, 2018, at 10:22 PM, Alex Harui 
> wrote:
> >>
> >> Maven's release plugins commit and push to Git and upload to
> repository.a.o.  I saw that some folks have a node that can commit to the
> a.o website SVN.  Is anyone already doing releases from builds?  What
> issues are there, if any?
> >
> >   It's just flat out not secure enough to do a release on.
> >
> > Can you give me an example of how it isn't secure enough?
>
>
> The primary purpose of these servers is to run untested,
> unverified code.
>
> Jenkins has some very sharp security corners tha

Re: Can we package release artifacts on builds.a.o?

2018-12-11 Thread Joan Touzet
- Original Message -
Allen Wittenauer  wrote:
> > On Dec 11, 2018, at 9:09 AM, Joan Touzet  wrote:
> > Perhaps I'm misunderstanding...are the Nexus-published builds not treated 
> > the same because they're not on dist.apache.org? Or are they not release 
> > versions?
>   Yes, you are misunderstanding.
>   1) Officially (legally?), source code distributions are "the release."  
> Any and all binaries are considered to be convenience binaries so users don’t 
> have to  compile.  They are not official.   [Statements like “verify a 
> release by rebuilding” don’t really parse as a result.]
>   2) As far as I’m aware/all the projects I’ve ever worked with, the 
> uploads to Nexus are to the snapshot repo, not the release repo.  The release 
> repos are still done manually. 

Thanks, Allen. So I am still fighting against the system here.

If binaries are conveniences, and they are not official, we should be able to 
auto-push binaries built on trusted infrastructure out to the world. Why can't 
that be our (Infra maintained & supported, costly from a non-profit 
perspective) CI/CD infrastructure?

-Joan


Re: Non committer collaborators on GitHub

2018-12-14 Thread Joan Touzet
Allen Wittenauer wrote:

>> On Dec 14, 2018, at 3:57 AM, Zoran Regvart  wrote:
>> And, probably the best one, is to have a ASF wide GitHub account that
>> builds can use.
> 
> I do think because of how Github works, an ASF-wide one is probably too 
> dangerous.  But I can’t see why private@project accounts couldn’t be added so 
> long as folks don’t do dumb things like auto-push code.  There has to be a 
> level of trust here unfortunately though which is why it may not come to 
> fruition. :(
> 
> Side-rant:
> 
> I think part of the basic problem here is that Github’s view of permissions 
> is really awful.  It is super super dumb that accounts have to have 
> admin-level privileges for repos to use the API to do some basic things that 
> can otherwise be gleaned by just scraping the user-facing website.  If anyone 
> from Github is here, I’d love to have a chat. ;)

FYI I've previously been told we can't use addons to GitHub to improve
the issue management workflow (like https://waffle.io/) precisely
because GitHub's permissions model is so poor: allowing an external
tool to move tickets around requires giving it effectively commit
access, which is forbidden to third parties.

Very annoying, because our project staff fully endorsed moving off of
JIRA (because they hated the interface) onto GitHub Issues, but now
we are somewhat impoverished by the minimalist approach GH takes
towards project management. Waffle would solve basically all of those
problems for us.

-Joan


Re: workspace cleanups needed on jenkins master

2018-12-27 Thread Joan Touzet
Hi there,
- Original Message -
> From: "Chris Lambertus" 

> As a rule of thumb, we’d like to see
> projects retain no more than 1 week or 7 builds worth of historical
> data at the absolute maximum.

> 54 GB ./CouchDB

Our config is:

Discard old items: checked
Days to keep old items: 7 (for a long while now)
Max # of old items to keep: 7 (newly changed - was blank)

Can you let us know if this helps?

I'm willing to look at more aggressive changes to reduce space,
but would like to have a better understanding of how that space
is being used. As a start, it looks like we could try adding a
deleteDir() command to our post { failure { ... } } block
(declarative pipeline).
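
i.e. something along these lines in the Jenkinsfile (not yet tested on our
pipeline):

    post {
      failure {
        deleteDir()   // drop the workspace of failed runs so they don't pile up on the node
      }
    }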

I'm off until Jan 6 on a desperately-needed vacation, so I can't
really dig in until then.

-Joan


Re: workspace cleanups needed on jenkins master

2018-12-27 Thread Joan Touzet
> CouchDB has 300+ builds from July taking up 100-150MB each. I’m not
> sure why these wouldn’t have been removed by the ‘days to keep old
> items’ parameter, so we’ll need to look into that on the infra side
> as well. If the addition of the ‘max # of old items’ parameter
> doesn’t purge them after the next build, we can remove them
> manually. If infra does have to go and do some manual pruning, these
> are the types of things we’ll be looking for and removing.
> 
> Just looking at the build history in the Jenkins UI for the CouchDB
> ‘master’ branch, it’s clear that the ‘days to keep old items’
> setting doesn’t seem to be working at all, since you have saved
> builds back to November. Something else we’ll need to look into. I
> doubt you were even aware of the July artifacts (I’m not sure how
> you’d be able to see those.)

That's odd, thanks for letting us know, Chris.

If INFRA can fix the problem as to why the very old builds are still
hanging about, hopefully you won't have to resort to such drastic
measures as deleting everything older than a certain date.

We were going to look into using the Promoted Builds plugin in the new
year to save builds for RCs and released versions; if you code something
to delete old builds, would it be possible for it to not just blanket
delete things older than a certain amount? The idea, again, is repeatable
builds towards release-able assets with saved logs for traceability and
audit trail. If we can't use Jenkins for this, we'll have to invent our
own system (at considerable labour/expsnse.)

-Joan


Re: PRJenkins builds for Projects

2019-01-04 Thread Joan Touzet


- Original Message -
> From: "Allen Wittenauer" 

>   This is the same model the ASF has used for JIRA for a decade+.
>It’s always been possible for anyone to submit anything to Jenkins
>   and have it get executed. Limiting PRs or patch files in JIRAs to
>   just committers is very anti-community. (This is why all this talk
>   about using Jenkins for building artifacts I find very
>   entertaining.  The infrastructure just flat out isn’t built for it
>   and absolutely requires disposable environments.)

Then we build a new, additional Jenkins that is committer-only (or PMC-
only, perhaps, if it's for release purposes). This is a tractable
problem.

We are stuck at an impasse where people need something to reduce the
manual workload, and we have an obsolete policy standing in its way.
We must be the last organisation in the world where people are forced
to release software through a manual process.

I don't see why this is something to be gleeful about.

-Joan


Re: PRJenkins builds for Projects

2019-01-07 Thread Joan Touzet
See travis-ci.org.

This is the model we could be emulating.
- Original Message -
From: "Alex Harui" 
To: builds@apache.org
Sent: Sunday, January 6, 2019 6:53:44 PM
Subject: Re: PRJenkins builds for Projects

What other organizations are running a similar patch/pr Jenkins capability and 
how do they implement "security" to prevent exploits like bitcoin miners and 
other attacks?

IMO, if you give free compute resources, the bad people will eventually figure 
out how to use it to their advantage.

-Alex

On 1/6/19, 10:52 AM, "Allen Wittenauer"  
wrote:



> On Jan 6, 2019, at 10:43 AM, Dominik Psenner  wrote:
> 
> On Sun, Jan 6, 2019, 19:32 Allen Wittenauer
>  
>> 
>> a) The ASF has been running untrusted code since before Github existed.
>> From my casual watching of Jenkins, most of the change code we run 
doesn’t
>> come from Github PRs.  Any solution absolutely needs to consider what
>> happens in a JIRA-based patch file world. [footnote 1,2]
>> 
> 
> If some project build begins to draw resources in an extraordinary fashion
> it will be noticed.

Strongly disagree. My cleaner code killed three stuck surefire jobs 
that had been looping on a handful of cores since sometime in 2018 yesterday.  
The sling jobs I noted earlier in the week had 20GB of RAM.   That’s even 
before we get into the unit-tests-that-are-really-integration-tests that are 
coming from the big data projects where gigs of memory and thousands of process 
slots are consumed on a regular basis.


Re: Can we package release artifacts on builds.a.o?

2019-01-07 Thread Joan Touzet
> Within the Apache Subversion project, have tooling[1] to assist an RM
> with
> pretty much all the steps of a release. From reading this thread, it
> seems
> like Royale's problem is getting RMs up to speed, so maybe it can be
> solved
> with additional build-side tooling?
> 
> [1] https://svn.apache.org/repos/asf/subversion/trunk/tools/dist/

This doesn't solve Alex's problem of multiple complex Windows setups,
near as I can tell. I believe this is why he is asking for a "single
machine" that is set up perfectly for his needs.

I believe virtualisation is the right answer to this, not a singleton
machine that has all of the binaries on it for all projects'
build tool chains. From prior experience I know how easy it is for
project A to mess up project B's build tool chain. But I'm not sure
there is a good answer for this other than "build your own Docker image
and start your build inside of that."
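
On the Linux side at least, the "build inside your own image" pattern is only a
few lines of Jenkinsfile; a hedged sketch, with the Dockerfile name and build
command as placeholders:

    pipeline {
      agent {
        dockerfile {
          filename 'Dockerfile.build'   // kept in the project's own repo
        }
      }
      stages {
        stage('Build') {
          steps {
            sh './build.sh'             // whatever the project's real build entry point is
          }
        }
      }
    }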

I realize a large % of ASF projects are Java, and it's easier to contain
these things when you have a single, versioned runtime, but given the
mention of .NET runtimes I think we have to consider the larger picture
(which also contains our dilemma - a massive, complex build chain that
can take DAYS to install and configure correctly by hand on Windows.)

This of course is in addition to the ability for a project to create
a commit using a bot.

> Then make that git repo a local clone, hmm?
>
> If you're talking a public one, then what is the "ask" from Infra for this
> repo? Every PMC can self-serve create git repositories as they need them.
> So it would seem "do that", then you'd need to ask for some extra authz to
> enable the bot for that one repository? And what is the mechanism to
> prevent leakage of released code into that repository? Or, say, the bot
> adjusting pom.xml to pull in malware from $bad ?

Right now a PMC can't self-serve create a git repo that can *only* be
written to by a single user (the bot's account), just ones that can be
written to by all committers in their LDAP group.

Perhaps we need the ability to create repos that are writable only
by the PMCs. I can see other uses for this (like our couchdb-admin repo).

I would trust a release repo of this sort that could be audited prior to
release time, as well as if legal concerns arose.

-Joan


Re: PRJenkins builds for Projects

2019-01-08 Thread Joan Touzet
Alex:

A short list, not comprehensive:

0) Bitcoin mining is against the Travis CI ToS.
   https://docs.travis-ci.com/legal/terms-of-service/
1) There are maximum job run times in Travis that prevent unbounded
   compute, regardless of technology.
2) Travis blacklists a bunch of IPs.
3) Travis has some proprietary heuristics that look for and kill
   bitcoin-mining jobs.
4) Travis can be configured to only run builds on specific branches,
   i.e. the build only runs against your master branch, or you can
   have the build run only against the proposed merge of the PR into
   the master branch.
5) There is always human review involved at some point.

As for uploading built artifacts automatically:

  https://docs.travis-ci.com/user/uploading-artifacts/

And doing commits:

  
https://stackoverflow.com/questions/42253765/getting-travisci-to-commit-and-push-a-modified-file-with-tags-releases#42299765

Encrypting credentials:

  https://docs.travis-ci.com/user/encryption-keys/

"Please note that encrypted environment variables are not available for pull 
requests from forks."

It's this last point that prevents unauthorized contributors (i.e.,
the general public we wring our hands about) from using your encrypted
credentials to do whatever. Anyone who has write access to the repo
already (i.e., CLA-signed committers) and makes their branch on the
repo itself would have access - but if you don't trust your committers,
who can you trust?
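
As a concrete sketch of how committers wire this up with the Travis CLI
(the variable name and value below are invented):

```
# encrypt a deploy credential into .travis.yml; Travis only decrypts it
# for builds of branches on the main repo, never for fork pull requests
gem install travis        # the Travis CLI, if not already installed
travis login --org        # authenticate against travis-ci.org
travis encrypt DEPLOY_TOKEN="s3cr3t" --add env.global
```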

I believe this is the missing piece for Jenkins CI.

-Joan

- Original Message -
> From: "Alex Harui" 
> To: builds@apache.org, woh...@apache.org
> Sent: Monday, January 7, 2019 1:06:27 PM
> Subject: Re: PRJenkins builds for Projects
> 
> Stephen, Joan,
> 
> Thanks for the pointers, but could you save me some time and explain
> how they implement "security" so folks can't run bitcoin miners via
> the PRs?
> 
> Thanks,
> -Alex
> 
> On 1/7/19, 7:54 AM, "Joan Touzet"  wrote:
> 
> See travis-ci.org.
> 
> This is the model we could be emulating.
> - Original Message -
> From: "Alex Harui" 
> To: builds@apache.org
> Sent: Sunday, January 6, 2019 6:53:44 PM
> Subject: Re: PRJenkins builds for Projects
> 
> What other organizations are running a similar patch/pr Jenkins
> capability and how do they implement "security" to prevent
> exploits like bitcoin miners and other attacks?
> 
> IMO, if you give free compute resources, the bad people will
> eventually figure out how to use it to their advantage.
> 
> -Alex
> 
> On 1/6/19, 10:52 AM, "Allen Wittenauer"
>  wrote:
> 
> 
> 
> > On Jan 6, 2019, at 10:43 AM, Dominik Psenner
> >  wrote:
> > 
> > On Sun, Jan 6, 2019, 19:32 Allen Wittenauer
> >  > 
> >> 
> >> a) The ASF has been running untrusted code since before
> >> Github existed.
> >> From my casual watching of Jenkins, most of the change
> >> code we run doesn’t
> >> come from Github PRs.  Any solution absolutely needs to
> >> consider what
> >> happens in a JIRA-based patch file world. [footnote 1,2]
> >> 
> > 
> > If some project build begins to draw resources in an
> > extraordinary fashion
> > it will be noticed.
> 
>   Strongly disagree. Yesterday my cleaner code killed three
>   stuck surefire jobs that had been looping on a handful of
>   cores since sometime in 2018.  The Sling jobs I noted
>   earlier in the week were using 20GB of RAM.   That’s even
>   before we get into the
>   unit-tests-that-are-really-integration-tests that are
>   coming from the big data projects, where gigs of memory and
>   thousands of process slots are consumed on a regular basis.
> 
> 
> 


Re: PRJenkins builds for Projects

2019-01-10 Thread Joan Touzet
> > I believe this is the missing piece for Jenkins CI.
> 
> Nope. Though configuring the behaviour for untrusted refs is a bit of
> a dark magic. For one the Authorize Project plugin was implemented
> without anyone paying attention to the permissions stuff in the
> Credentials plugin... so there are some minor pitfalls there...
> mostly around people not actually understanding what the different
> credentials stores are for. Then the SCM API trusted refs stuff is
> poorly understood... and finally on top of all that Pipeline
> currently runs the Groovy script on the master so you cannot verify
> untrusted refs that change the Jenkinsfile while having the security
> protections.
> 
> But you can most certainly set up Jenkins to have access to a user's
> deployment credentials when triggered by the user wanting to deploy
> while preventing PRs from accessing those credentials... However it
> probably requires a Jenkins Ninja such as myself, KK, Jesse or Oleg
> to set it up!
> 
> New initiatives in Jenkins will help make these things accessible to
> people not intimately aware of the finer details of how Jenkins
> works

I'm willing to believe that Jenkins, the software, is incapable of
this, though more detail would be nice rather than just "trust me,
it's hard."

What about buildbot? Or another technology we could use with INFRA's
support? Last time I looked at buildbot, its integration with Docker
was very poor.

I don't have any special attachment to Jenkins.

-Joan


Re: External CI Service Limitations

2019-07-03 Thread Joan Touzet
(With my CouchDB release engineer hat on only)

Anyone know if any of these external services supports platforms other
than amd64/x86_64?

CouchDB keeps receiving a lot of pressure to build on aarch64, ppc64le
and s390x, which keeps pushing us back to Jenkins CI (ASF or
independent). And if we have to do that, then not much else matters to us.

-Joan

On 2019-07-03 3:54, Jarek Potiuk wrote:
> I spoke to Kamil - Gitlab CI maintainer (in CC:) and he will speak to CEO
> of GitLab and Product Managers of GitLab CI whether GitLab will be willing
> to help with it.
> 
> J.
> 
> On Wed, Jul 3, 2019 at 9:33 AM Jarek Potiuk 
> wrote:
> 
>> Actually speaking of Gitlab CI. I realised my close friend is actually THE
>> maintainer and main person responsible for Gitlab CI. I will reach out to
>> him and see if they can help with this and provide free service. Shame I
>> have not thought about it before.
>>
>> J.
>>
>> On Wed, Jul 3, 2019 at 8:37 AM Allen Wittenauer
>>  wrote:
>>
>>>
>>>
 On Jul 2, 2019, at 11:12 PM, Jeff MAURY  wrote:

 Azure Pipelines has the big plus of supporting Linux, Windows and macOS
>>> nodes
>>>
>>> There’s a few that support various combinations of non-Linux.
>>> Gitlab CI has been there for a while.  Circle CI has had OS X and is in
>>> beta with Windows.  Cirrus CI has all those plus FreeBSD. etc, etc.  It’s
>>> quickly becoming required that cloud-based CI systems do more than just
>>> throw up a Linux box.
>>>
 And I think you can add your nodes to the pools
>>>
>>> I think they are limited to being on Azure tho, IIRC.  But I’m
>>> probably not.  I pretty much gave up on doing anything serious with it.
>>>
>>> I really wanted to like Pipelines.  The UI is nice.  But in the
>>> end, Pipelines was one of the more frustrating ones to work with in my
>>> experience, and that was with some help from the MS folks. It suffers a
>>> death of a thousand cuts (lack of complex, real-world examples, a custom
>>> docker binary, pre-populated bits here and there, a ton of env vars, an
>>> artifact system that is a total disaster, etc.).  Lots of small problems
>>> that add up to it just not being worth the effort.
>>>
>>> Hopefully it’s improved since I last looked at it months and
>>> months ago though.
>>
>>
>>
>> --
>>
>> Jarek Potiuk
>> Polidea  | Principal Software Engineer
>>
>> M: +48 660 796 129 <+48660796129>
>> [image: Polidea] 
>>
>>
> 



Re: TravisCI: various build failures - anyone else?

2019-07-03 Thread Joan Touzet

Looks like a known Travis CI bug:

https://travis-ci.community/t/install-jdk-sh-failing-for-openjdk9-and-10/3998

On 2019-07-03 4:58 p.m., P. Ottlinger wrote:

Hi *,

since roughly a week ago we experience strange build failures on Travis
such as

https://travis-ci.org/apache/incubator-tamaya/builds/553877325



Using custom target: /home/travis/openjdk9
ln: failed to create symbolic link
‘/home/travis/openjdk9/lib/security/cacerts’: No such file or directory
The command "~/bin/install-jdk.sh --target "/home/travis/openjdk9"
--workspace "/home/travis/.cache/install-jdk" --feature "9" --license
"GPL" --cacerts" failed and exited with 1 during .
Your build has been stopped.


Anyone else? Any ideas?

Thanks,
Phil



Re: External CI Service Limitations

2019-07-03 Thread Joan Touzet




On 2019-07-03 5:57 p.m., Greg Stein wrote:

On Wed, Jul 3, 2019 at 11:36 AM Allen Wittenauer
 wrote:

...



CouchDB keeps receiving a lot of pressure to build on aarch64, ppc64le
and s390x, which keeps pushing us back to Jenkins CI (ASF or
independent). And if we have to do that, then not much else matters to

us.

 One of the nice things about using a system that supports external
runners is that it allows for contributions of CPU time from like minded
individuals.  I wouldn’t trust them to do anything more than run tests
though.



Right. We have a number of external machines contributed to our buildbot
network. An openbsd box, a Mac box, etc. They run a bunch of svn tests.

We have some external machines using JNLP to hook into our Jenkins network.

If somebody can find a $platform box they need, then it can be hooked into
our network for a single overview console of tests (assuming you're already
using some of Apache's systems).


Yup, we're good here, though a lot of those donated boxes are single 
points of failure :(


I was asking if any of the service platforms provided this. So far, it 
looks like no.




Cheers,
-g



Re: External CI Service Limitations

2019-07-16 Thread Joan Touzet
Hi Raymond,

Would this 50,000 CI minutes per month be spread across the entire ASF,
or just each project? With >300 projects here, that's potentially
50,000 * 300 = 15 million minutes we're talking about.

What happens when a project exceeds that amount of minutes? Busy
projects that build each PR, where the build/test cycle takes, say, 30
minutes * 3 configurations = ~100 minutes per PR, would consume those
minutes after roughly 500 builds - easily reached by 100 PRs with a few
incremental pushes to each. That's not much time.

-Joan

On 2019-07-16 16:20, Raymond Paik wrote:
> Jarek,
> 
> You're not required to migrate your repo over to GitLab. We have other
> projects that keep their source code in GitHub, but are using GitLab for
> CI. Hope this helps...
> 
> Thanks,
> 
> Ray
> 
> On Tue, Jul 16, 2019 at 12:51 PM Jarek Potiuk 
> wrote:
> 
>> Yep we use Git indeed but we have Github repo (
>> https://github.com/apache/airflow)  and I believe this is pretty much
>> standard for all Apache projects (adding Greg as well).
>>
>> I don't think (or am I wrong?) the open source program directly applies in
>> this case because we would have to have GitLab Repo as well, but in our
>> case we really need GitLab CI integration with GitHub repository.
>>
>> Would that be possible to get this case working ?
>>
>> J
>>
>> On Tue, Jul 16, 2019 at 6:27 PM Raymond Paik 
>> wrote:
>>
>>> Thanks Jarek:
>>>
>>> We do have an open source program at GitLab (
>>> https://about.gitlab.com/solutions/open-source/)  where open source
>>> projects get access to top tier features (either SaaS or self-hosted) for
>>> free including up to 50,000 CI minutes/month.
>>>
>>> Are you currently using Git as your source code repository?
>>>
>>> Ray
>>>
>>> On Tue, Jul 16, 2019 at 8:49 AM Jarek Potiuk 
>>> wrote:
>>>
>>>> Adding Raymond Paik who is GitLab Community Manager and wants to
>>> chime-in
>>>> the thread!
>>>>
>>>> J.
>>>>
>>>> On Thu, Jul 4, 2019 at 1:04 AM Allen Wittenauer
>>>>  wrote:
>>>>
>>>>>
>>>>>
>>>>>> On Jul 3, 2019, at 3:15 PM, Joan Touzet  wrote:
>>>>>>
>>>>>> I was asking if any of the service platforms provided this. So far,
>>> it
>>>>> looks like no.
>>>>>
>>>>> I was playing around bit with Drone today because we actually
>>>>> need ARM in $DAYJOB and this convo reminded me that I needed to check
>>> it
>>>>> out.
>>>>>
>>>>> So far, I’m a little underwhelmed with the feature set. (No
>>>>> built-in artifacting, no junit output processing, buggy/broken yaml
>>>>> parser,  … to be fair, they are relatively new so likely still building
>>>>> these things up) BUT! They do support gitlab and acting as a gitlab ci
>>>>> runner. So theoretically one could do linux/x86, windows/x86, mac os
>>> x, and
>>>>> linux/arm off of a combo of gitlab ci + drone.
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>>
>>>> Jarek Potiuk
>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>
>>>> M: +48 660 796 129 <+48660796129>
>>>> [image: Polidea] <https://www.polidea.com/>
>>>>
>>>>
>>>
>>
>>
>> --
>>
>> Jarek Potiuk
>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>
>> M: +48 660 796 129 <+48660796129>
>> [image: Polidea] <https://www.polidea.com/>
>>
>>
> 



Re: Github Actions

2019-08-28 Thread Joan Touzet
Continuing the top-post trend...

I'd rather see full audit logs kept ~forever for any use of credentials,
including the code that was executed.

If we can't stop the leak, we can at least keep the paper trail.

Right now, with our aggressive build cleanup steps, I don't think this
is happening. Archiving that data somewhere else for legal purposes
might be a good idea.

-Joan "just an idea" Touzet

On 2019-08-28 5:02, sebb wrote:
> I think the pre-verified code could run on a separate system with
> restricted access.
> That's how self-service works for creating mailing lists, for example.
> 
> In this case, there would need to be a separate host with read access
> to Jenkins.
> It could accept publish requests from Jenkins, and route them accordingly.
> 
> This would require a bit of effort to set up, but could be used for
> multiple projects.
> 
> On Wed, 28 Aug 2019 at 03:26, Greg Stein  wrote:
>>
>> Yeah. FIgured as much, hoped that I was missing something :)
>>
>> (note: we have the same issue with buildbot and jenkins: we simply trust
>> the communities to not exfil that data)
>>
>> On Tue, Aug 27, 2019 at 9:16 PM Matt Sicker  wrote:
>>
>>> How to avoid leaking secrets: only way to do that is via pre-verified code
>>> that executes something with that secret. Otherwise, there’s literally
>>> infinite ways to leak it being a Turing machine and all. This applies to
>>> all CICD tools.
>>>
>>> On Tue, Aug 27, 2019 at 20:32, Greg Stein  wrote:
>>>
 Hi Francis,

 Is the token needed to push from calcite to calcite-site? Is that an
>>> oauth
 token or something? And are you able to use the repository settings to
>>> add
 secrets, but you don't have the right token? Or you cannot add secrets at
 all? (I can't tell since I have superpowers)

 I've added GSTEIN_TEST_SECRET to Calcite. See if you can extract/print
>>> that
 into your build/action log. If so, then we can try to figure out the
 security here (ie. how do we avoid Actions exfiltrating the token?)

 Thanks,
 -g

 On Tue, Aug 27, 2019 at 5:19 AM Francis Chuang >>>
 wrote:

> I have implemented the ability to generate the website and javadoc for
> Calcite using Github Actions. See:
> https://github.com/apache/calcite/tree/test-site/.github/workflows
>
> The missing piece is that we need the token to publish to our
> calcite-site repository to be added as a secret in Github Actions and
> there is currently no clear process as to whether this is allowed or
>>> how
> to get this done.
>
> See:
> https://issues.apache.org/jira/browse/INFRA-18874
> https://issues.apache.org/jira/browse/INFRA-18875
>
> Francis
>
> On 27/08/2019 7:52 pm, Greg Stein wrote:
>> Have you had an opportunity to make progress on this, to share with
>>> us?
>>
>> Anybody else with news?
>>
>> Thanks!
>> -g
>> InfraAdmin, ASF
>>
>>
>> On Tue, Aug 13, 2019 at 3:59 PM Karl Heinz Marbaise <
>>> khmarba...@gmx.de
>
>> wrote:
>>
>>> Hi,
>>>
>>> I've made a simple PoC for the Apache Maven Dependency Plugin on a
>>> separate branch.
>>>
>>> I will try within the next days more features for example Mac OS
 builds
>>> etc.
>>>
>>>
>>> Currently I simply push my changes via gitbox ..
>>>
>>> maven-dependency-plugin (GITHUB_ACTIONS)$ git remote -v
>>> origin
 https://gitbox.apache.org/repos/asf/maven-dependency-plugin.git
>>> (fetch)
>>> origin
 https://gitbox.apache.org/repos/asf/maven-dependency-plugin.git
>>> (push)
>>>
>>>
>>> Also I'm interested to use SonarCloud related with GitHub Actions..?
>>>
>>>
>>> Kind regards
>>> Karl Heinz Marbaise
>>> Apache Maven PMC
>>>
>>> [1]:
>>> https://github.com/apache/maven-dependency-plugin/runs/192633340
>>> [2]:
>>>
>>>
>

>>> https://github.com/apache/maven-dependency-plugin/blob/66435b225e7885f44b25207e025469f6d5237107/.github/workflows/maven.yml
>>>
>>> On 12.08.19 00:31, Greg Stein wrote:
 On Sun, Aug 11, 2019 at 5:15 PM Francis Chuang <
> francischu...@apache.org

 wrote:
> ...

> I think there are quite a few ASF projects using gitbox and Github
 and
> this would be a very good complement or replacement for Travis,
> appvoyer
> and other CI/CD platforms currently in use.
>
> Is there any interest from the ASF to enable this for all Gitbox
> projects when it becomes fully public?
>

 Absolutely. The Infrastructure team would love to see groups try
>>> this
>>> out,
 and share the experiences here.

 If there are any hurdles, then share them and we'll try to knock
>>> them
>>> down.

 I am also interested in being able to push to our website
 automati

Re: Jenkins - Build docker image within a container

2020-01-10 Thread Joan Touzet
You may have to run a Docker-in-Docker sidecar (a separate container
running the Docker daemon) and point your build at it, or you *might* be
able to get away with docker-outside-of-docker by mounting the host's
Docker socket into the build container.
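
As a rough sketch, assuming the agent allows socket mounting (the Maven
image and goals below are only illustrative):

```
# docker-outside-of-docker: the build container talks to the host's
# Docker daemon through the mounted socket, which is what the fabric8
# plugin falls back to when no DOCKER_HOST is set
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$PWD":/workspace -w /workspace \
  maven:3-jdk-8 \
  mvn -B clean verify
```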

Some relevant tickets:

https://github.com/fabric8io/docker-maven-plugin/issues/1005
https://github.com/fabric8io/docker-maven-plugin/issues/863

I have zero maven+docker experience so this is the most advice I can offer.

Good luck,
Joan

On 2020-01-10 5:33, Thomas Bouron wrote:
> Hi there.
> 
> I have a question regarding the current Jenkins setup which I hope some of
> you will have the answer.
> 
> Apache Brooklyn is currently built on Jenkins within a docker container.
> The pipeline is set up so it first builds an image and then uses it to run
> maven + tests.
> 
> Brooklyn provides different distribution packages: tarball, deb and rpm. We
> are currently trying to add a docker image to the mix. I created a PR for
> this (https://github.com/apache/brooklyn-dist/pull/148) which updates the
> `Jenkinsfile` to pass the docker socket to the docker container. But the
> build fails with the following error message:
> 
> ```
> Failed to execute goal io.fabric8:docker-maven-plugin:0.31.0:remove
> (cleanup) on project karaf-docker-image: Execution cleanup of goal
> io.fabric8:docker-maven-plugin:0.31.0:remove failed: No <dockerHost> given,
> no DOCKER_HOST environment variable, no read/writable
> '/var/run/docker.sock' or '//./pipe/docker_engine' and no external provider
> like Docker machine configured
> ```
> 
> Note that doing the same thing on my local machine works, but not on Apache
> Jenkins. Any idea how I can fix this? I created a Jira ticket for INFRA (
> https://issues.apache.org/jira/browse/INFRA-19523) but they told me to ask
> on this list instead, hence this mail.
> 
> Thank you.
> Best.
> 


Re: broken builds taking up resources

2020-01-23 Thread Joan Touzet

On 2020-01-23 4:50, Chesnay Schepler wrote:

On 23/01/2020 10:19, Thomas Bouron wrote:

On Thu, 23 Jan 2020 at 08:56, Robert Munteanu  wrote:


On Wed, 2020-01-22 at 17:53 -0800, Chris Lambertus wrote:

Additionally, orphaned docker jobs are causing major resource
contention. I will be adding a weekly job to docker system prune —all
&& service docker restart.

+1, it's easy to get this wrong. It would be great if you could also
document some best practices when using containers for CI on the ASF
Jenkins instance.

docker system prune -af works, but is probably a big hammer, when each
job should instead clean up after itself.



+1 on this.

For Brooklyn, we are using docker container for CI and I think we do it
right (you can see the Dockerfile and Jenkinsfile in the repo for more
context). But guidelines would be great. For instance, it took a long 
time

to arrive to the particular setup we are currently running.

Back to the question of timeouts, the main Brooklyn build pipeline takes
around 1h30.

Best.


The Flink snapshot deployments take about 1h15m.



CouchDB is busy moving off the shared Jenkins workers, but our runs take 
20-30 minutes on pull requests (unless there's a problem) and about 
30-45 minutes on release branches after that. There are plans to extend 
the release branch builds with other tasks that take longer, such as 
preparing our Docker images -- these run rarely but do take upwards of 
90-120 minutes each, and cannot be split into multiple steps.


-Joan


Re: Automatically building GitHub pull requests with Jenkins

2020-06-09 Thread Joan Touzet
Try specifying your git repository as 
https://github.com/apache/guacamole-server instead of 
git://github.com/apache/guacamole-server.git ? Just a guess.


On 09/06/2020 16:45, Mike Jumper wrote:

Hello all,

I've been trying to configure Jenkins jobs to automatically build pull
requests for the Guacamole repositories on GitHub using the "CloudBees Pull
Request Builder" as documented here:

https://cwiki.apache.org/confluence/display/INFRA/Kicking+off+a+build+in+Jenkins+with+a+GitHub+PR

https://docs.cloudbees.com/docs/admin-resources/latest/plugins/pull-request-builder-for-github

So far, I am having no luck, with the current attempt producing a cryptic
IOException apparently related to git:

"... hudson.remoting.ProxyException:
hudson.remoting.FastPipedInputStream$ClosedBy: The pipe was closed at...
 at
hudson.remoting.FastPipedInputStream.close(FastPipedInputStream.java:112)
 at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:363)
 at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:284)
 at
com.cloudbees.jenkins.plugins.git.vmerge.ChannelTransport$GitPushTask.invoke(ChannelTransport.java:133)
 at
com.cloudbees.jenkins.plugins.git.vmerge.ChannelTransport$GitPushTask.invoke(ChannelTransport.java:117)
 at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3052)
 at hudson.remoting.UserRequest.perform(UserRequest.java:212)
 at hudson.remoting.UserRequest.perform(UserRequest.java:54)
 at hudson.remoting.Request$2.run(Request.java:369)
Caused: hudson.remoting.ProxyException: java.io.IOException: Pipe is
already closed
 at
hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:154)
 at
hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:138)
 at
hudson.remoting.ProxyOutputStream$Chunk$1.run(ProxyOutputStream.java:255)
 at hudson.remoting.PipeWriter$1.run(PipeWriter.java:158)
 at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at
hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
 at
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Pipe is already closed
 ..."

(see
https://builds.apache.org/view/E-G/view/Guacamole/job/guacamole-server-pull-request/12/console
)

Any ideas on what might be going wrong here, or any examples on this being
used in practice?

Thanks,

- Mike



Re: Controlling the images used for the builds/releases

2020-06-22 Thread Joan Touzet
Hey Jarek, thanks for starting this thread. It's a thorny issue, for 
sure, especially because binary releases are not "official" from an ASF 
perspective.


(Of course, this is a technicality; the fact that your PMC is building 
these and linking them from project pages, and/or publishing them out as 
apache/<project> or top-level <project> images at Docker Hub can be seen as a 
kind of officiality. It's just, for the moment, not an Official Act of 
the Foundation for legal reasons.)


On 22/06/2020 09:52, Jarek Potiuk wrote:

Hello Everyone,

I have a kind question and request for your opinions about using external
Docker images and downloaded binaries in the official releases for Apache
Airflow.

The question is: How much can we rely on those images being available in
those particular cases:

A) during static checks
B) during unit tests
C) for building production images for Airflow
D) for releasing production Helm Chart for Airflow

Some more explanation:

For a long time we are doing A) and B) in Apache Airflow and we followed a
practice that when we found an image that is goo for us and seems "legit"
we are using it. Example -
https://hub.docker.com/r/hadolint/hadolint/dockerfile/ - HadoLint image to
check our Dockerfiles.  Since this is easy to change pretty much
immediately, and only used for building/testing, I have no problem with
this, personally and I think it saves a lot of time and effort to maintain
some of those images.


Sure. Build tools can even be GPL, and something like a linter isn't a 
hard dependency for Airflow anyway. +1



But we are just about to start releasing Production Image and Helm Chart
for Apache Airflow and I started to wonder if this is still acceptable
practice when - by releasing the code - we make our users depend on those
images.


Just checking: surely a production Airflow Docker image doesn't have 
hadolint in it?



We are going to officially support both - image and helm chart by the
community and once we release the image and helm chart officially, those
external images and downloads will become dependencies to our official
"releases". We are allowing our users to use our official Dockerfile
to build a new image (with user's configuration) and Helm Chart is going to
be officially available for anyone to install Airflow.


Sounds like a good step for your project.


The Docker images that we are using are from various sources:

1) officially maintained images (Python, KinD, Postgres, MySQL for example)
2) images released by organizations that released them for their own
purpose, but they are not "officially maintained" by those organizations
3) images released by private individuals

While 1) is perfectly OK for both image and helm chart, I think for 2) and
3) we should bring the images to Airflow community management.


I agree, and would go a step further, see below.


Here is the list of those images I found that we use:

- aneeshkj/helm-unittest
- ashb/apache-rat:0.13-1
- godatadriven/krb5-kdc-server
- polinux/stress (?)
- osixia/openldap:1.2.0
- astronomerinc/ap-statsd-exporter:0.11.0
- astronomerinc/ap-pgbouncer:1.8.1
- astronomerinc/ap-pgbouncer-exporter:0.5.0-1

Some of those images are released by organizations that are strong
stakeholders in the project (Astronomer especially). Some other images are
by organizations that are still part of the community but not as strong
stakeholders (GoDataDriven) - some others are by private individuals who
are contributors (Ash, Aneesh) and some others are not-at-all connected to
Apache Airflow (polinux, osixia).

For me quite clearly - we are ok to rely on "officially" maintained images
and we are not ok to rely on images released by individuals in this case.
But there is a range of images in-between that I have no clarity about.

So my questions are:

1) Is this acceptable to have a non-officially released image as a
dependency in released code for the ASF project?


First question: Is it the *only* way you can run Airflow? Does it end up 
in the source tarball? If so, you need to review the ASF licensing 
requirements and make sure you're not in violation there. (Just Checking!)


Second: Most of these look like *testing* dependencies, not runtime 
dependencies.



2) If it's not - how do we determine which images are "officially
maintained".

3) If yes - how do we put the boundary - when image is acceptable? Are
there any criteria we can use or/ constraints we can put on the
licences/organizations releasing the images we want to make dependencies
for released code of ours?


How hard would it be for the Airflow community to import the Dockerfiles 
and build the images themselves? And keep those imported forks up to 
date? We do this a lot in CouchDB for our dependencies (not just Docker) 
where it's a personal project of someone in the community, or even where 
it's some corporate thing that we want to be sure we don't break on when 
they implement a change for their own reasons.


Automating building thes

Re: [ci-builds] GitHub credentials

2020-07-29 Thread Joan Touzet
Infra hasn't approved these in the past. If that policy changes, I'd 
very much like to know about it.


For CouchDB we use a token on my account that I added for this purpose, 
limited to Apache repos only. Of course, these API calls count towards 
my personal limit, which affects other GitHub work that I do outside of 
the ASF.


-joan

On 29/07/2020 10:18, Andor Molnar wrote:

I’ve created a dummy Github user for ZooKeeper, it works fine in terms of 
branch scanning, but it doesn’t have permissions to update the Github Build 
status at the end of each build.

I think I should add it to the project as contributor/member, but not sure how 
to do that.

Please advise.

Andor




On 2020. Jul 27., at 13:24, Andor Molnar  wrote:

I’m interested in this one too.
Currently I’m using regular ‘Git’ source instead of GitHub to speed up branch 
discovery, but this way I cannot run builds against Pull Requests.

Andor




On 2020. Jul 23., at 21:38, Zoran Regvart  wrote:

Hi builders,
I see some questions on this but not much conclusion currently.

What credentials should we use with GitHub SCM source? Should we use
personal access tokens, or will there be an INFRA provided
Jenkins-wide credential we can use (like a GitHub App[1])?

We're now at the mercy of the GitHub API limits and as more projects
are migrated I expect that to have a big impact.

zoran

[1] 
https://docs.cloudbees.com/docs/cloudbees-ci/latest/cloud-admin-guide/github-app-auth
--
Zoran Regvart






Re: New Credentials for Github jobs

2020-08-14 Thread Joan Touzet

Could we get these on the ci-couchdb server for testing? Thanks.

-Joan

On 2020-08-14 3:37 a.m., Gavin McDonald wrote:

Hi All,

For those of you waiting for the 'asf-ci' credentials - this is still not
resolved yet, and is waiting
for Cloudbees support.

However - I have created some new credentials, based off of a GH App rather
than a role account.

Look for credentials 'ASF Cloudbees Jenkins ci-builds' and give that a try
in your jobs please and see if that works for you. Let me know how it goes
for you.



Re: Controlling the images used for the builds/releases

2020-09-13 Thread Joan Touzet

HI Jarek,

Can you comment on one specific thing? In Proposal 1 you still leave the 
text "...MUST only add binary/bytecode files". This is not possible for 
convenience packages in many situations - for instance OpenOffice or 
other languages - where providing a full release of a product requires a 
language runtime. It has always bothered me that this text effectively 
prevents redistribution of binary assets in the packages that are not 
strictly speaking derived from the source code.


As you go far beyond this with the container packaging in Proposal 2, I 
believe Proposal 1 needs to be modified to match. In my opinion a 
suitable replacement would be something like:


"In all such cases...version number as the source release, and MUST 
include only the binary/bytecode files that are necessary, via the 
compiling and packaging of that source code release and its 
dependencies, to produce a functional deliverable. All instructions..."


-Joan

On 2020-09-13 4:40 p.m., Jarek Potiuk wrote:

Just for your information - after a discussion in the ComDev mailing list.
I created a proposal for Apache Software Foundation to introduce changes to
the "ASF release policies", to make it clear and straightforward to release
"convenience packages" in the form of "software packaging" (such as Helm
Charts and Container Images) rather than "compiled packages" as recognised
so far by the ASF policies.

The proposal is here:
https://cwiki.apache.org/confluence/display/COMDEV/Updates+of+policies+for+the+convenience+packages

The discussion in the ComDev ASF mailing list is here:
https://lists.apache.org/thread.html/r49c3ef0a8423664c564c0c2719056662021f03b5678ef5b249892c10%40%3Cdev.community.apache.org%3E

We are going to discuss it and propose to the ASF board to vote on the
changes.

I look forward to all comments and I hope it can pave the way for the ASF
to provide a coherent approach for releasing Container Images, Helm Charts
for all ASF projects.

On Mon, Aug 31, 2020 at 9:23 PM Jarek Potiuk 
wrote:


Just to revive this thread and let you know what we've done in Airflow.

We merged changes to our repository that allow our users to rebuild all
images if they need to -using official sources. It's not very involved and
not a lot of code to maintain:
https://github.com/apache/airflow/pull/9650/
Next time when we release Airflow Sources including the Helm Chart, any of
our users will be able to rebuild all the images used in charts from the
ASF-released source package.

The whole discussion ended up to be not about the Licence, but about the
content of the official ASF source package release.

I personally think this is the only way to fulfill this chapter from ASF
release policy:
http://www.apache.org/legal/release-policy.html#what-must-every-release-contain

Every ASF release must contain a source package, which must be sufficient

for a user to build and test the release provided they have access to the
appropriate platform and tools.



I would love to hear other thoughts about it.

J.




On Tue, Jun 23, 2020 at 11:42 PM Roman Shaposhnik 
wrote:


On Tue, Jun 23, 2020 at 2:26 AM Jarek Potiuk 
wrote:




My understanding the bigger problem is the license of the dependency

(and

their dependencies) rather than the official/unofficial status.  For

Apache

Yetus' test-patch functionality, we defaulted all of our plugins to

off

because we couldn't depend upon GPL'd binaries being available or

giving

the impression that they were required.  By doing so, it put the onus

on

the user to specifically enable features that depends upon GPL'd
functionality.  It also pretty much nukes any idea of being user

friendly.

:(



Indeed - Licensing is important, especially for source code

redistribution.

We used to have some GPL-install-on-your-own-if-you-want in the past but
those dependencies are gone already.





2) If it's not - how do we determine which images are "officially
maintained".


 Keep in mind that Docker themselves brand their images as
'official' when they actually come from Docker instead of the

organizations

that own that particular piece of software.  It just adds to the

complexity.




Not really. We actually plan to make our own Apache Airflow Docker

image as an

official one. Docker has very clear guidelines on how to make images
"official" - see https://docs.docker.com/docker-hub/official_images/ -

and

there is quite a long list of those:
https://github.com/docker-library/official-images/tree/master/library -
most of them maintained by the "authors" of the image. Docker has a
dedicated team that reviews, checks those images and they encourage that
the "authors" maintain them. Quote from Docker's docs: "While it is
preferable to have upstream software authors maintaining their
corresponding Official Images, this is not a strict requirement."




3) If yes - how do we put the boundary - when image is acceptable?

Are

there any criteria we can use or/ constraints we can put on the
licences/

Re: Controlling the images used for the builds/releases

2020-09-13 Thread Joan Touzet

On 2020-09-13 5:19 p.m., Jarek Potiuk wrote:

Can you please make an inline comment in the document? The Cwiki allows
inline comments - just select a paragraph and comment it there. This is the
easiest way to keep the discussion focused in the document. I am not sure I
understand the OpenOffice-specific things; I'd love to understand that
though (I used OpenOffice for years) :)


Done, and I expanded on this point.


I think that any release of ASF software must have corresponding sources
that can be use to generate those from. Even if there are some binary
files, those too should be generated from some kind of sources or
"officially released" binaries that come from some sources. I'd love to get
some more concrete examples of where it is not possible.


Sure, this is totally possible. I'm just saying that the amount of 
source is extreme in the case where you're talking about a desktop app 
that runs in Java or Electron (Chrome as a desktop app), as two examples.


-Joan


J.

On Sun, Sep 13, 2020 at 11:09 PM Joan Touzet  wrote:


HI Jarek,

Can you comment on one specific thing? In Proposal 1 you still leave the
text "...MUST only add binary/bytecode files". This is not possible for
convenience packages in many situations - for instance OpenOffice or
other languages - where providing a full release of a product requires a
language runtime. It has always bothered me that this text effectively
prevents redistribution of binary assets in the packages that are not
strictly speaking derived from the source code.

As you go far beyond this with the container packaging in Proposal 2, I
believe Proposal 1 needs to be modified to match. In my opinion a
suitable replacement would be something like:

"In all such cases...version number as the source release, and MUST
include only the binary/bytecode files that are necessary, via the
compiling and packaging of that source code release and its
dependencies, to produce a functional deliverable. All instructions..."

-Joan

On 2020-09-13 4:40 p.m., Jarek Potiuk wrote:

Just for your information - after a discussion in the ComDev mailing

list.

I created a proposal for Apache Software Foundation to introduce changes

to

the "ASF release policies", to make it clear and straightforward to

release

"convenience packages" in the form of "software packaging" (such as Helm
Charts and Container Images) rather than "compiled packages" as

recognised

so far by the ASF policies.

The proposal is here:


https://cwiki.apache.org/confluence/display/COMDEV/Updates+of+policies+for+the+convenience+packages


The discussion in the ComDev ASF mailing list is here:


https://lists.apache.org/thread.html/r49c3ef0a8423664c564c0c2719056662021f03b5678ef5b249892c10%40%3Cdev.community.apache.org%3E


We are going to discuss it and propose to the ASF board to vote on the
changes.

I look forward to all comments and I hope it can pave the way for the ASF
to provide a coherent approach for releasing Container Images, Helm

Charts

for all ASF projects.

On Mon, Aug 31, 2020 at 9:23 PM Jarek Potiuk 
wrote:


Just to revive this thread and let you know what we've done in Airflow.

We merged changes to our repository that allow our users to rebuild all
images if they need to -using official sources. It's not very involved

and

not a lot of code to maintain:
https://github.com/apache/airflow/pull/9650/
Next time when we release Airflow Sources including the Helm Chart, any

of

our users will be able to rebuild all the images used in charts from the
ASF-released source package.

The whole discussion ended up to be not about the Licence, but about the
content of the official ASF source package release.

I personally think this is the only way to fulfill this chapter from ASF
release policy:


http://www.apache.org/legal/release-policy.html#what-must-every-release-contain


Every ASF release must contain a source package, which must be

sufficient

for a user to build and test the release provided they have access to

the

appropriate platform and tools.



I would love to hear other thoughts about it.

J.




On Tue, Jun 23, 2020 at 11:42 PM Roman Shaposhnik 


wrote:


On Tue, Jun 23, 2020 at 2:26 AM Jarek Potiuk 


wrote:




My understanding the bigger problem is the license of the dependency

(and

their dependencies) rather than the official/unofficial status.  For

Apache

Yetus' test-patch functionality, we defaulted all of our plugins to

off

because we couldn't depend upon GPL'd binaries being available or

giving

the impression that they were required.  By doing so, it put the onus

on

the user to specifically enable features that depends upon GPL'd
functionality.  It also pretty much nukes any idea of being user

friendly.

:(



Indeed - Licensing is important, especially for source code

redistribution.

We used to have some GPL-install-on-your-own-if-you-want in th

Re: Controlling the images used for the builds/releases

2020-09-14 Thread Joan Touzet

Hi Jarek,

I'm about to head out for 3 weeks, so I'm going to miss most of this 
discussion. I've done my best to leave comments in your document, but 
just picking out one topic in this thread:


On 14/09/2020 02:40, Jarek Potiuk wrote:

Yeah - I see the point and to be honest, that was exactly my original
intention when I wrote the proposal. I modified it slightly to reflect that
- I think now after preparing the proposal that the "gist" of it is really
to introduce two kinds of convenience packages - one is the "compiled"
package (which should be far more restricted in what it contains due to
limitations of licences such as GPL) and the other is simply "packaged"
software - where we put independent software or binaries in a single
"convenience" package but it does not have as far-reaching
legal/licence consequences as compiled packages.

The criteria I proposed introduce an interesting concept - the recursive
definition of "official" packages - that was the most "difficult" part
to come up with. But I believe as long as the criteria we come up with can
be recursively applied to any binaries or reference to those binaries up to
the end of the recursive chain of dependencies and as long as we provide
instructions on how to build those binaries by the "power" users, I believe
it should be perfectly fine to include such binaries in "packaged" software
without explicitly releasing all the sources for them.

So I tried to put it in the way to make it clear that the original
limitations remain in place for the "compiled" package (effectively I am
not changing any wording in the policy regarding those) but I (hope) make
it clear that other limitations and criteria apply to "packaged" software
using those modern tools like Docker/Helm but also any form of installable
packages (like Windows installers). I've also specifically listed the
"windows installers" as an example package.


I don't like the double standard of "compiled" vs. "packaged" software. 
It's hard to understand when to apply which, and creates an un-level 
playing field. Not every ASF project can create both, and you're using a 
different ruler for each. I realize it was your intent to avoid clouding 
the water, and to apply stricter rules to one vs. the other, but I feel 
this is just continuing the double-standard I previously mentioned, 
albeit in a different form.


Good luck with the effort, and thanks for taking on this herculean task.

-Joan



J.


On Mon, Sep 14, 2020 at 2:57 AM Allen Wittenauer
 wrote:





On Sep 13, 2020, at 2:55 PM, Joan Touzet  wrote:

I think that any release of ASF software must have corresponding sources
that can be use to generate those from. Even if there are some binary
files, those too should be generated from some kind of sources or
"officially released" binaries that come from some sources. I'd love to

get

some more concrete examples of where it is not possible.


Sure, this is totally possible. I'm just saying that the amount of

source is extreme in the case where you're talking about a desktop app that
runs in Java or Electron (Chrome as a desktop app), as two examples.


... and mostly impossible when talking about Windows containers.






Re: Controlling the images used for the builds/releases

2020-09-14 Thread Joan Touzet

On 14/09/2020 11:54, Jarek Potiuk wrote:

Oh yeah. I start realizing now how herculean it is :). No worries, I am
afraid when you are back, the discussion will be just warming up :).

Speaking of the "double standard" - the main reason really comes from
licensing. When you compile something in that is GPL, your code starts to
be bound by the licence. But when you just bundle it together in a software
package - you are not.

So this is pretty much unavoidable to apply different rules to those
situations. No matter what - we have to make this distinction IMHO. But
let's see what others say on that.  I'd love to hear your thought on that,
before you head out.


Taking CouchDB as an example: shipping *just* the compiled .beam files is 
possible but helps no one, because they require a working Erlang runtime 
alongside them. In other words, they are not a runnable asset on their own.


I believe you can compile Erlang against 100% non-GPL assets, but this 
is not common. How many people don't use GNU libc (glibc) on Linux?


Thus the double standard: "binary packages" are only allowed for those 
languages where the compiled asset is, on its own, sufficient to run the 
program. This is not even true for e.g. Node.js or Python whenever there 
are (potentially GNU) libc bindings.



J


On Mon, Sep 14, 2020 at 5:47 PM Joan Touzet  wrote:


Hi Jarek,

I'm about to head out for 3 weeks, so I'm going to miss most of this
discussion. I've done my best to leave comments in your document, but
just picking out one topic in this thread:

On 14/09/2020 02:40, Jarek Potiuk wrote:

Yeah - I see the point and to be honest, that was exactly my original
intention when I wrote the proposal. I modified it slightly to reflect

that

- I think now after preparing the proposal that the "gist" of it is

really

to introduce two kinds of convenience packages - one is the "compiled"
package (which should be far more restricted what it contains due to
limitations of licences such as GPL) and the other is simply "packaged"
software - where we put independent software or binaries in a single
"convenience" package but it does not have as far-reaching
legal/licence consequences as compiled packages.

The criteria I proposed introduce an interesting concept - the recursive
definition of "official" packages - that was the most "difficult" part
to come up with. But I believe as long as the criteria we come up with

can

be recursively applied to any binaries or reference to those binaries up

to

the end of the recursive chain of dependencies and as long as we provide
instructions on how to build those binaries by the "power" users, I

believe

it should be perfectly fine to include such binaries in "packaged"

software

without explicitly releasing all the sources for them.

So I tried to put it in the way to make it clear that the original
limitations remain in place for the "compiled" package (effectively I am
not changing any wording in the policy regarding those) but I (hope) make
it clear that other limitations and criteria apply to "packaged" software
using those modern tools like Docker/Helm but also any form of

installable

packages (like Windows installers). I've also specifically listed the
"windows installers" as an example package.


I don't like the double standard of "compiled" vs. "packaged" software.
It's hard to understand when to apply which, and creates an un-level
playing field. Not every ASF project can create both, and you're using a
different ruler for each. I realize it was your intent to avoid clouding
the water, and to apply stricter rules to one vs. the other, but I feel
this is just continuing the double-standard I previously mentioned,
albeit in a different form.

Good luck with the effort, and thanks for taking on this herculean task.

-Joan



J.


On Mon, Sep 14, 2020 at 2:57 AM Allen Wittenauer
 wrote:





On Sep 13, 2020, at 2:55 PM, Joan Touzet  wrote:

I think that any release of ASF software must have corresponding

sources

that can be use to generate those from. Even if there are some binary
files, those too should be generated from some kind of sources or
"officially released" binaries that come from some sources. I'd love

to

get

some more concrete examples of where it is not possible.


Sure, this is totally possible. I'm just saying that the amount of

source is extreme in the case where you're talking about a desktop app

that

runs in Java or Electron (Chrome as a desktop app), as two examples.


... and mostly impossible when talking about Windows containers.











Re: GitHub PR comment build trigger

2020-10-12 Thread Joan Touzet
And to add to this, with the Blue Ocean UI for Multibranch Pipeline, it 
is a single click to rebuild a build. It's not as friendly as 
commenting, but it's a single button on the results view for your build, 
which is linked right from the PR.


Of course, this is limited to only people who have Jenkins accounts, 
which is all committers to our repo.


-Joan

On 2020-10-12 11:10 p.m., Christopher wrote:

Hi Andor,

I'm not sure if INFRA is going to enable that plugin, but I thought
I'd suggest some alternatives if they don't:

In Accumulo, we set up a "PR Builder" job in Jenkins, that we can
manually trigger. It is a parameterized build that takes two
parameters: PR and PR_Variant.
The PR is the PR number, and the variant is either "head" or "merge".
The branch specifier to check out is:
refs/remotes/origin/pr/${PR}/${PR_variant}
The refspec to fetch from the repository configuration looks like:
+refs/pull/*:refs/remotes/origin/pr/* (you have to click on Advanced
to see this option)
We also use the "Set Build Name" option to: PR #${PR} (Build #${BUILD_NUMBER})
And include a "Pre-Build step" to upload the build description to:
Pull Request #${PR} - ${GIT_COMMIT}
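
(For anyone unfamiliar with GitHub's synthetic pull request refs, this is
roughly what that refspec fetches - the PR number below is made up:)

```
# fetch GitHub's read-only PR refs and check one out locally
git fetch origin '+refs/pull/*:refs/remotes/origin/pr/*'
git checkout refs/remotes/origin/pr/1234/merge   # or .../head for the unmerged tip
```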

This works well for us. It may work for you also. The only thing is
you have to go to Jenkins to trigger the build manually.

We also use GitHub Actions, which is probably even easier to build,
because GitHub has a "rebuild jobs" option, right in the interface (to
work around transient build problems), and you can configure some
manually triggered jobs as well. We have several that might be useful
examples at: https://github.com/apache/accumulo/tree/main/.github/workflows/

I hope this helps somebody, if not you,

Christopher

On Mon, Oct 12, 2020 at 8:53 AM Andor Molnar  wrote:


Hi,

Sorry if the topic is redundant, I haven’t been following builds@ list for a 
while and couldn’t find the archives online.

Is there already a way to configure GitHub PR comment to trigger build in the 
new Jenkins instance?

I think it was the ‘GitHub PR Comment Build’ plugin in the old instance.
https://plugins.jenkins.io/github-pr-comment-build

Thanks,
Andor




Docker rate limits likely spell DOOM for any Apache project CI workflow relying on Docker Hub

2020-10-28 Thread Joan Touzet
Got your attention?

Here's what arrived in my inbox around 4 hours ago:

> You are receiving this email because of a policy change to Docker products 
> and services you use. On Monday, November 2, 2020 at 9am Pacific Standard 
> Time, Docker will begin enforcing rate limits on container pulls for 
> Anonymous and Free users. Anonymous (unauthenticated) users will be limited 
> to 100 container image pulls every six hours, and Free (authenticated) users 
> will be limited to 200 container image pulls every six hours, when 
> enforcement is fully implemented. 

Their referenced blog posts are here:

https://www.docker.com/blog/scaling-docker-to-serve-millions-more-developers-network-egress/

https://www.docker.com/blog/understanding-inner-loop-development-and-pull-rates/

Since I haven't seen this discussed on the builds list yet (and I'm not
subscribed to users@infra), I wanted to make clear the impact. I would
bet that just about every workflow using Jenkins, buildbot, GHA or
otherwise uses uncredentialed `docker pull` commands. If you're using
the shared Apache CI workers, every pull you're making is counting
towards this 100 pulls/6 hour limit. Multiply that by every ASF project
on those servers, and multiply that again by the total number of PRs /
change requests / builds per project, and :(

Apache's going to hit these new limits real fast. And we must act fast
to avoid problems, as those new limits kick in **MONDAY**.

Even for those of us lucky enough to have sponsorship for dedicated CI
workers, it's still a problem. Infra has scripts to wipe all
not-currently-in-use Docker containers off of each machine every 24
hours (or did, last I looked). That means you can't rely on local
caching. Other projects may also have added --force to their `docker
pull` requests in their CI workflows, to work around issues with cached,
corrupted downloads (a big problem for us on the shared CI
infrastructure), or to work around issues with the :latest tag caching
when it shouldn't.

This extends beyond projects using CI in the way Docker outlines on
their second blog post linked above, namely their encouragement to use
multi-stage builds. If local caching can't be relied on, there's no
advantage. If what's being pulled down is an image containing that
project's full build environment - this is what CouchDB does and I
expect others do as well, as setting up our build environment, even
automated, takes 30-45 minutes - frequent changes to the build
dependencies require frequent pulls of those images, which cannot be
mitigated via the Docker-recommended multi-stage builds.

=

Proposed solutions:

1. Infra provides credentialed logins through the Docker Hub apache
organisation to projects. Every project would have to update their
Jenkins/buildbot/GHA/etc workflows to consume and use these credentials
for every `docker pull` command. This depends on Apache actually being
exempted from the new limits (I'm not sure, are we?) and those creds
being distributed widely...which may run into Infra Policy issues.
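
A rough sketch of what each job would then run, with the credentials
injected as CI secrets (the variable names here are hypothetical):

```
# authenticate before any docker pull in the job
echo "$DOCKERHUB_TOKEN" | docker login -u "$DOCKERHUB_USER" --password-stdin
docker pull ubuntu:20.04   # now counted against the authenticated account
```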

2. Infra provides their own Docker registry. Projects that need images
can host them there. These will be automatically exempt. Infra will have
to plan for sufficient storage (this will get big *fast*) and bandwidth
(same). They will also have to firewall it off from non-Apache projects.

This should be configured as a pull through caching registry, so that
attempts to `docker pull docker.apache.org/ubuntu:latest` will
automatically reach out to hub.docker.com and store that image locally.
Infra can populate this registry with credentials within the ASF Docker
Hub org that are, hopefully, exempt from these requirements.
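
As a rough sketch of what such a pull-through cache could look like (host
name and paths are illustrative only):

```
# run the stock registry image as a pull-through cache of Docker Hub
docker run -d --restart=always -p 5000:5000 \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  -v /srv/registry:/var/lib/registry \
  registry:2

# CI nodes then point the Docker daemon at it, e.g. in /etc/docker/daemon.json:
#   { "registry-mirrors": ["https://docker.apache.org"] }
# after which a plain `docker pull ubuntu:latest` is served from the cache.
```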

3. Like #2, but per-project, on Infra-provided VMs. Today this is not
practical, as the standard Infra-provided VM only has ~20GB of local
storage. Just a handful of Docker images will eat that space nearly
immediately.

===

I think #2 above is the most logical and expedient, but it requires a
commitment from Infra to make happen - and to get the message out - with
only 4 days until DOOM.

What does the list think? More importantly, what does Infra think?

-Joan "I'm gonna sing The Doom Song now!" Touzet


Re: Docker rate limits likely spell DOOM for any Apache project CI workflow relying on Docker Hub

2020-10-29 Thread Joan Touzet

On 2020-10-29 11:37 a.m., Allen Wittenauer wrote:




On Oct 28, 2020, at 11:57 PM, Chris Lambertus  wrote:

Infra would LOVE a smarter way to clean the cache. We have to use a heavy 
hammer because there are 300+ projects that want a piece of it, and who don’t 
clean up.. We are not build engineers, so we rely on the community to advise us 
in dealing with the challenges we face. I would be very happy to work with you 
on tooling to improve the cleanup if it improves the experience for all 
projects.


I'll work on YETUS-1063 so that things make more sense.  But in short, Yetus' 
"docker-cleanup --sentinel" will  purge container images if they are older than 
a week, then kill stuck containers after 24 hours. That order prevents running jobs from 
getting into trouble.  But it also means that in some cases it doesn't look very clean 
until two or three days later.  But that's ok: it is important to remember that an empty 
cache is a useless cache.  Those values came from experiences with Hadoop and HBase, but 
we can certainly add some way to tune them.  Oh, and unlike the docker tools, it pretty 
much ignores labels.  It does _not_ do anything with volumes, probably something we need 
to add.
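
For anyone without Yetus handy, a very rough plain-docker approximation
of that policy is below. This is not the Yetus script, and unlike
sentinel mode it won't stop containers that are still running:

  # Remove unused images created more than a week ago...
  docker image prune --all --force --filter "until=168h"
  # ...then remove stopped containers created more than a day ago.
  docker container prune --force --filter "until=24h"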



(Sidebar about the script's details)

I tried to read the shell script, but I'm not in the headspace to fully 
parse it at the moment. If I'm understanding correctly, this will still 
catch CouchDB's CI docker images if they haven't changed in a week, 
which happens often enough, negating the cache.


As a project, we're kind of stuck between a rock and a hard place. We 
want to force a docker pull on the base CI image if it's out of date or 
the image is corrupted. Otherwise we want to cache forever, not just for 
a week. I can probably manage the "do we need to re-pull?" bit with some 
clever CI scripting (check for the latest image hash locally, validate 
the local image, pull if either fails) but I don't understand how the 
script resolves the latter.
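
For the re-pull decision, I'm imagining something like the sketch below.
The image name is made up, and as I understand it the HEAD request on the
manifest URL doesn't count against the pull limit, only GETs do:

  IMAGE="apache/couchdb-ci"   # placeholder name for this sketch
  TAG="latest"

  # Anonymous pull token for the Hub registry API
  TOKEN=$(curl -fsSL "https://auth.docker.io/token?service=registry.docker.io&scope=repository:${IMAGE}:pull" \
    | sed -n 's/.*"token":"\([^"]*\)".*/\1/p')

  # HEAD request: fetch only the current manifest digest, no layers
  REMOTE=$(curl -fsSI -H "Authorization: Bearer ${TOKEN}" \
    -H "Accept: application/vnd.docker.distribution.manifest.list.v2+json" \
    -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
    "https://registry-1.docker.io/v2/${IMAGE}/manifests/${TAG}" \
    | tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" {print $2}')

  # Digest of whatever we already have locally (empty if not present)
  LOCAL=$(docker image inspect "${IMAGE}:${TAG}" \
    --format '{{index .RepoDigests 0}}' 2>/dev/null | cut -d@ -f2)

  # Pull only if the local copy is missing or stale. (The "validate the
  # local image" step is left out of this sketch.)
  if [ -z "${LOCAL}" ] || [ "${LOCAL}" != "${REMOTE}" ]; then
    docker pull "${IMAGE}:${TAG}"
  fi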


Can an exemption list be passed to the script so that images matching a 
certain regex are excluded? You say the script ignores labels entirely, 
so perhaps not...


-Joan


Re: Docker rate limits likely spell DOOM for any Apache project CI workflow relying on Docker Hub

2020-11-02 Thread Joan Touzet

Hey Gavin,

To avoid the rate limiting, this means that we need to bake CI 
credentials into jobs for accounts inside of the apache org. Those 
credentials need to be used for all `docker pull` commands.
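
Concretely, each agent (or each job) would need something along these
lines before any pull. The credential variable names here are invented,
and the token would come from Jenkins credentials, never from the
Jenkinsfile itself:

  echo "${DOCKERHUB_TOKEN}" | docker login --username "${DOCKERHUB_USER}" --password-stdin
  docker pull apache/some-image:some-tag   # counted against the account, not the worker's IP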


How can we do this in a way that complies with ASF Infra policy?

Thanks,
Joan "the battle wages on / for Toy Soldiers" Touzet


On 2020-11-02 4:57 a.m., Gavin McDonald wrote:

Hi All,

Any project under the 'apache' org on DockerHub is not affected by the
restrictions.

Kind Regards

Gavin "The futures so bright you gotta wear shades" McDonald


On Thu, Oct 29, 2020 at 11:08 PM Gavin McDonald 
wrote:


Hi ,

Just to note I have emailed DockerHub, asking for clarification on our
account and what our benefits are.


On Thu, Oct 29, 2020 at 6:34 PM Allen Wittenauer
 wrote:




On Oct 29, 2020, at 9:21 AM, Joan Touzet  wrote:

(Sidebar about the script's details)


 Sure.


I tried to read the shell script, but I'm not in the headspace to fully

parse it at the moment. If I'm understanding correctly, this will still
catch CouchDB's CI docker images if they haven't changed in a week, which
happens often enough, negating the cache.

 Correct. We actually tried something similar for a while and
discovered that in a lot of cases, upstream packages would disappear (or
worse, have security problems), thus making it look like the image is still
"good" when it's not. So a weekly rebuild at least guarantees some level
of "yup, still good" without having too much of a negative impact.


As a project, we're kind of stuck between a rock and a hard place. We

want to force a docker pull on the base CI image if it's out of date or the
image is corrupted. Otherwise we want to cache forever, not just for a
week. I can probably manage the "do we need to re-pull?" bit with some
clever CI scripting (check for the latest image hash locally, validate the
local image, pull if either fails) but I don't understand how the script
resolves the latter.

 Most projects that use Yetus for their actual CI testing build
the image used for the CI as part of the CI. It is a multi-stage,
multi-file docker build in which each run uses a 'base' Dockerfile (provided
by the project) that rarely changes, plus a per-run file that Yetus generates
on the fly, with both images tagged by either git sha or branch (depending
upon context). Because docker reference-counts the shared layers, the docker
images effectively act as a "rolling cache" and (beyond a potential weekly
cache removal) full builds are rare... thus making them relatively cheap
(typically <1m runtime) unless the base image had a change far up the chain
(so structure wisely). Of course, this also tests the actual image of the CI
build as part of the CI. ("What tests the testers?" philosophy.) Given that
Jenkins tries really hard to have job affinity, re-runs were still cheap
after the initial one. [Ofc, now that the cache is getting nuked every day]
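
A minimal sketch of that base-plus-per-run layering, for anyone following
along - this is not the actual Yetus tooling, and the image names and
variables below are invented:

  # Rarely-changing base, rebuilt only when Dockerfile.base itself changes
  docker build -t myproject-ci-base:"${GIT_BRANCH}" -f Dockerfile.base .

  # Thin per-run layer generated on the fly; it reuses the cached base layers
  {
    echo "FROM myproject-ci-base:${GIT_BRANCH}"
    echo "RUN useradd -u ${USER_ID} -m jenkins"
    echo "USER jenkins"
  } > Dockerfile.run
  docker build -t myproject-ci-run:"${GIT_SHA}" -f Dockerfile.run .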

 Actually, looking at some of the ci-hadoop jobs, it looks like
yetus is managing the cache on them.  I'm seeing individual run containers
from days ago at least.  So that's a good sign.


Can an exemption list be passed to the script so that images matching a

certain regex are excluded? You say the script ignores labels entirely, so
perhaps not...

 Patches accepted. ;)

 FWIW, I've been testing on my local machine for unrelated reasons
and I keep blowing away running containers I care about so I might end up
adding it myself.  That said: the code was specifically built for CI
systems where the expectation should be that nothing is permanent.




--

*Gavin McDonald*
Systems Administrator
ASF Infrastructure Team






Re: Bintray Deprecation

2021-03-28 Thread Joan Touzet
Thanks Gavin. I know this couldn't have been an easy task. Looking
forward to the info.

On 28/03/2021 11:00, Gavin McDonald wrote:
> Hi All,
> 
> As advertised in [1] Bintray is closing down.
> 
> We have secured a replacement for Bintray, which is JFrog's more extensive
> software offering, 'Artifactory'.
> 
> I have been busy these last few days and have today finally completed
> migration of all packages on Bintray over to apache.jfrog.io
> 
> There is still a bit of work to do; I am currently working on LDAP
> integration and it should be ready over the next few days.
> To prevent divergence, I will be turning off upload access to Bintray
> tomorrow. (Downloads, as per the blog post, get turned off May 1st - but
> we can start serving from the new location now.)
> 
> If anyone needs access to Artifactory for some reason before I get LDAP
> enabled please let me know.
> 
> In addition, for anyone that wants to attend, I will be organising a builds@
> meeting with a guest JFrog tech Guru to go over some features of
> Artifactory - our old Bintray service is by comparison only 5% of what
> Artifactory can do (my estimate).
> 
> 
> 
> [1] -
> https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/
> 
> 


Re: Better stability with docker authenticated jenkins agents

2021-04-06 Thread Joan Touzet
Hi Mick,

On 06/04/2021 06:34, Mick Semb Wever wrote:
> tl;dr
> Can and should all jenkins agents be (automatically) docker authenticated,
> for improved stability around docker commands?
> 
> 
> This past week the ci-cassandra.apache.org CI fell over because a fair
> percentage of docker pulls failed. Our pipeline runs a lot of docker
> containers. In the past week the number of containers run went from ~180 to
> ~270 and that pushed something over the edge. All of a sudden all docker
> commands had a significant chance of failing, that was high enough to
> ensure every pipeline was guaranteed to fail. These failures came in a few
> different forms, the list can be read in INFRA-21666. Googling them shows
> that this is a known problem, sometimes around firewalls, networks, dns,
> etc. Our jenkins agents are donated by a handful of different companies and
> are located in various different places, so such issues don't make much
> sense. The other typical fix reported was to just run docker authenticated,
> i.e. `docker login`. Trying this immediately solved all problems on
> ci-cassandra.apache.org. This was done with a temporary (and empty)
> dockerhub account, that each agent has manually logged in with. Based on
> all this, the request has been made for an official apache CI
> dockerhub account and to have jenkins agents automatically logged in, with
> credentials stored in an appropriate manner.
> 
> Has anyone experience with such issues before?

Yes. See https://issues.apache.org/jira/browse/INFRA-20795 for detail.

In short, once Infra agrees to create the images for us, we'll move all
our CI dependencies into those containers, and should no longer have issues.

> Is this a sound and reasonable request to ask of Infra?

This could work too but *only* when Infra manages the agents. For most
of our agents, Infra does not have ssh access (due to NAT-style
proxying) so there's nothing they can do... _other_ than give us those
dockerhub creds. That sounds more unwieldy than the approach outlined in
ticket 20795.

-Joan


Re: Better stability with docker authenticated jenkins agents

2021-04-06 Thread Joan Touzet
On 06/04/2021 18:09, Mick Semb Wever wrote:
>>
>>> Has anyone experience with such issues before?
>>
>> Yes. See https://issues.apache.org/jira/browse/INFRA-20795 for detail.
>>
>> In short, once Infra agrees to create the images for us, we'll move all
>> our CI dependencies into those containers, and should no longer have
>> issues.
>>
> 
> 
> Thanks Joan. We do have our testing images already in the apache docker
> account. And the unauthenticated docker commands were falling over even
> pulling just official ubuntu images.

My understanding is that pulls of all images from the apache/* namespace
are not subject to rate limiting. Thus, the recommendation to move
everything you need inside of it.

If that's not practical, or you're building images that require other
assets... this won't work for you.

-Joan


Re: Bintray Deprecation

2021-04-08 Thread Joan Touzet
Hi Gavin, any updates? We need to do a release and get our
documentation/external references updated. I tried pinging you on IRC
twice, but no response.

-Joan

On 28/03/2021 12:49, Joan Touzet wrote:
> Thanks Gavin. I know this couldn't have been an easy task. Looking
> forward to the info.
> 
> On 28/03/2021 11:00, Gavin McDonald wrote:
>> Hi All,
>>
>> As advertised in [1] Bintray is closing down.
>>
>> We have secured a replacement for Bintray, which is JFrog's more extensive
>> software offering, 'Artifactory'.
>>
>> I have been busy these last few days and have today finally completed
>> migration of all packages on Bintray over to apache.jfrog.io
>>
>> There is still a bit of work to do; I am currently working on LDAP
>> integration and it should be ready over the next few days.
>> To prevent divergence, I will be turning off upload access to Bintray
>> tomorrow. (Downloads, as per the blog post, get turned off May 1st - but
>> we can start serving from the new location now.)
>>
>> If anyone needs access to Artifactory for some reason before I get LDAP
>> enabled please let me know.
>>
>> In addition, for anyone that wants to attend, I will be organising a builds@
>> meeting with a guest JFrog tech Guru to go over some features of
>> Artifactory - our old Bintray service is by comparison only 5% of what
>> Artifactory can do (my estimate).
>>
>>
>>
>> [1] -
>> https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/
>>
>>


Re: Better stability with docker authenticated jenkins agents

2021-04-14 Thread Joan Touzet
On 07/04/2021 07:43, Mick Semb Wever wrote:
>>
>> My understanding is that pulls of all images from the apache/* namespace
>> are not subject to rate limiting. Thus, the recommendation to move
>> everything you need inside of it.
>>
> As >95% of our CI docker commands are pulls from apache/ images,
> if rate-limiting is the cause of this (note that nowhere did we see the
> toomanyrequests response error),
> then we still need to authenticate docker to get the rate-limiting
> exception for those apache/ images,
> as you mention here¹. Has that changed?

My mistake, I mis-remembered why we wanted this change. Moving images to
the apache organisation only prevents them from being auto-deleted by
Docker Hub's scrubbing process, which started in the middle of last year.

Your proposal to have a Docker Hub account seems reasonable, but since
this is something projects can solve themselves, it's not critical path.

We'll probably store the Docker Hub creds in Jenkins, then reference
that in the build, which is what we do for other creds we need today. As
we use declarative pipeline, that's something like:

  docker {
image "${DOCKER_IMAGE}"
label 'docker'
args "${DOCKER_ARGS}"
registryCredentialsId "${DOCKER_CREDS}"
  }

Then it is easy to use our own Jenkins-stored creds
(https://www.jenkins.io/doc/book/using/using-credentials/), or Infra can
give us a pair to use instead.

> Maybe I'm circling, but doesn't this then support the need that we should
> have jenkins agents docker authenticated somehow?
> 
> [1]
> https://lists.apache.org/thread.html/rede9074dd499ae10dcb501dedcdec43fe9cbb5c646a2c38b19946f85%40%3Cbuilds.apache.org%3E


Re: Better stability with docker authenticated jenkins agents

2021-04-14 Thread Joan Touzet
On 14/04/2021 13:05, Mick Semb Wever wrote:
>> Then it is easy to use our own Jenkins-stored creds
>> (https://urldefense.proofpoint.com/v2/url?u=https-3A__www.jenkins.io_doc_book_using_using-2Dcredentials_&d=DwIFaQ&c=adz96Xi0w1RHqtPMowiL2g&r=42Z7FyMoAS1DbvgKNjU8zxi7xTPVAGalPzk7bfmRVgw&m=Oa7qm3rmqlXgK8wUUTEnADpIgZT18tE6W-ghG1QUspU&s=vRKVrs-iZa60eTjfvdc1aggzj_-CyObd2xs70PdN1rs&e=
>>  ), or Infra can
>> give us a pair to use instead.
>>
> 
> 
> Ahh! Do we have rights to store credentials in the ASF jenkins? I was
> under the impression INFRA had to do this for us.

Yes, you can, especially for Cassandra where you have your own separate
build master.

Projects are on the honour system not to use creds they're not supposed
to have access to, which is at least one major reason why CouchDB
requested and received their own CloudBees/Jenkins build master. The
separate build master structure should prevent anyone outside of your
PMC-approved group from creating any jobs on your build server that
might use those credentials inappropriately.

-Joan


Re: packer and vagrant and virtualbox

2022-03-10 Thread Joan Touzet
Try something like:

https://gist.github.com/mak3r/3f05c9d4f6f46d24d99bcfa4ac33

I think you need vagrant.

On 2022-03-10 3:40 a.m., Gavin McDonald wrote:
> Hi All,
> 
> docs are failing me.
> 
> I have packer currently outputting an aws type .box file
> 
> I would like to have that opened in VirtualBox but can't
> seem to find the right combination of config/command
> 
> Does anyone want to assist with this?
> 
> TIA
> 


Re: packer and vagrant and virtualbox

2022-03-10 Thread Joan Touzet
Supposedly it can be done with a packer post-processor, but I've no 
experience with that. (I've previously done full AMI exports into .box 
files; there are walkthroughs online.)
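
If the post-processor route doesn't pan out, the manual path I'm thinking
of looks roughly like this - the IDs, bucket and names are placeholders,
and it assumes the VM Import/Export service role is already set up:

  # Export the AMI's disk to a VMDK in S3
  aws ec2 export-image --image-id ami-0123456789abcdef0 \
    --disk-image-format VMDK \
    --s3-export-location S3Bucket=my-export-bucket,S3Prefix=packer-test/

  # When the export task finishes, fetch the disk (named after the task id)
  # and wire it into a local VirtualBox VM for testing
  aws s3 cp "s3://my-export-bucket/packer-test/${EXPORT_TASK_ID}.vmdk" .
  VBoxManage createvm --name ami-test --ostype Ubuntu_64 --register
  VBoxManage storagectl ami-test --name SATA --add sata
  VBoxManage storageattach ami-test --storagectl SATA --port 0 --type hdd \
    --medium "${EXPORT_TASK_ID}.vmdk"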


I'm guessing you tried:

https://learn.hashicorp.com/tutorials/packer/aws-get-started-post-processors-vagrant

Does their example not work?


On 10/03/2022 04:38, Gavin McDonald wrote:

Hi Joan.

On Thu, Mar 10, 2022 at 10:01 AM Joan Touzet  wrote:


Try something like:

https://gist.github.com/mak3r/3f05c9d4f6f46d24d99bcfa4ac33

I think you need vagrant.



yeah I have that.

What I am doing is creating an AMI that uploads to AWS - this all works;
what I want it to do next is produce a file(s) that I can open locally so I
can test that AMI before deployment.





On 2022-03-10 3:40 a.m., Gavin McDonald wrote:

Hi All,

docs are failing me.

I have packer currently outputting an aws type .box file

I would like to have that opened in VirtualBox but can't
seem to find the right combination of config/command

Does anyone want to assist with this?

TIA