Hey all,

Packaging is a work in progress at the moment. Wido, I and others are working
on this, but we are not there yet. Partially this is because there are a 
multitude of things to consider when we discuss packaging. Hence this mail to 
share a lot of the thoughts that went into packaging.

First of all is how we look at the packages. With the ASF it is pretty clear 
that the "release" is the source code. We tag a particular state of the source 
tree and voila, we have a release. So "packaging" our "release" is as simple as
making a tarball or zip of the code and making it available to users (aside
from the ASF process to name something a release).
Compiling the code as part of our build procedure generates artifacts (using
the mvn package phase). Artifacts in this sense are jar, war and zip files
containing a mix of compiled java classes, scripts, documentation (and 
optionally dependencies). Most of these artifacts are already automagically 
sent to the apache snapshot repositories by jenkins because we inherit the 
maven configuration from the apache master (or as a final release if we wanted 
to).
Finally, "packaging" is taking the artifacts generated by the compile/build step
and turning them into some kind of OS-specific package like an RPM or DEB.

The current build system based around maven is designed for this way of 
working. Maven is only taking responsibility for turning the source code into 
artifacts and additional scripts can take those artifacts and combine them into 
packages. This is explicitly done this way to not clutter a generic and 
multiplatform java build process with very OS-specific packaging steps. The
packaging step should be something like: download the sources as a tarball,
extract, run mvn package (with any additional profiles you need), and generate
the package from the artifacts. In the packaging directory there is a centos63
spec file which does this (see the %build and %install sections).
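
To make that flow a bit more concrete, here is a rough sketch of the first
steps in Python. This is not our actual build script: the function names, the
tarball name, the work directory and the profile name are all made up for
illustration, and I am assuming the tarball unpacks into a directory named
after itself. The last step (generating the RPM or DEB from the artifacts) is
left out on purpose, since that is the OS-specific part.

```python
import subprocess
from pathlib import Path


def packaging_commands(tarball, workdir, profiles=()):
    """Return the (cwd, argv) steps for: extract sources, run mvn package.

    Assumption: the tarball unpacks into a directory named after the
    tarball itself (minus .tar.gz). Profile names are hypothetical.
    """
    workdir = Path(workdir)
    name = Path(tarball).name
    if name.endswith(".tar.gz"):
        name = name[: -len(".tar.gz")]
    source_dir = workdir / name

    mvn = ["mvn", "package"]
    if profiles:
        # -P takes a comma-separated list of profile ids
        mvn += ["-P", ",".join(profiles)]

    return [
        # cwd=None means "run from the current directory"
        (None, ["tar", "-xzf", str(tarball), "-C", str(workdir)]),
        (source_dir, mvn),
        # Turning the artifacts into an RPM/DEB would follow here.
    ]


def run_steps(steps):
    """Run each step in order and stop at the first failure."""
    for cwd, argv in steps:
        subprocess.run(argv, cwd=cwd, check=True)
```

Something like run_steps(packaging_commands("cloudstack-4.1.0.tar.gz", "build"))
would then do the extract and build, after which the OS-specific tooling takes
over.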

Currently we have released the 4.0.0 version as a source release and some 
members have graciously offered to compile, build and package the source and 
make that available to the community for their convenience. I think the 
explicit wish is that we are able to provide a distribution ourselves (meaning 
the Apache CloudStack community) instead of just providing source code. This 
brings us back to the original discussion regarding licensing and how to deal 
with dependencies in packages.

Wido and I had a lengthy discussion on this subject at ApacheConEU. We haven't
reached a conclusion yet, but we have a good enough understanding of the
problem that we can bring our ideas and discussions to the list.

One of the main arguments is how to deal with dependencies in packages. There 
are a couple of options here:

* Create self-contained artifacts in step 2 (compile/build) that contain all
  required dependencies. Packages made from these artifacts need no outside
  dependencies save the bare essentials like java runtime, tomcat and python.

* Create "bare" artifacts with just our compiled code in step 2
  (compile/build). Packages need to include all dependencies for all jars that
  are generated as part of our code.

* Hybrid mix and match of the two options above.

* The old waf build is even worse. We compile bare artifacts in step 2
  (compile/build) and then package downloaded dependencies as far as the ASF
  permits and let the package dependencies deal with the others (like
  mysql-connector-java).

In my view one of the big issues here is version management of our 
dependencies. If we want to ship packages for certain operating systems we need 
to align our dependency versions to exactly the versions shipped by those 
distributions. I know that usually a higher or lower version of a java 
dependency will just work, but history has proven that you should not rely on 
this. If we have tested our version of CloudStack with mysql-connector-java
5.1.12, should we ship a package with a dependency on that specific version, or
hope for the best and allow any version? (Leading the witness, I know.) We also
have quite a few dependencies on ancient libraries (> 3 years old) that might
not be available as packages at all, or at least not in the distribution's
repo.
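
To illustrate the choice between pinning and not pinning, here is a small
sketch that prints the dependency line an RPM spec file would carry in each
case. The helper name requires_line is made up for this example and is not
part of our packaging code; only the "Requires:" syntax itself is standard RPM.

```python
def requires_line(name, version=None, exact=True):
    """Build an RPM spec 'Requires:' line for one dependency.

    Hypothetical helper, for illustration only.
    - no version: accept whatever the distribution ships
    - exact=True: pin to the version we tested against
    - exact=False: accept that version or anything newer
    """
    if version is None:
        return f"Requires: {name}"
    operator = "=" if exact else ">="
    return f"Requires: {name} {operator} {version}"


# Pin to exactly what we tested with:
print(requires_line("mysql-connector-java", "5.1.12"))
# Hope for the best and allow any version:
print(requires_line("mysql-connector-java"))
```

The pinned form only installs if the distribution ships exactly that version,
which is the alignment problem described above. The unpinned form always
installs but may pull in a version we never tested.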

Next question is where to put stuff we package. We toss a lot of things in 
/usr/share/java for example, in some cases with the cloud- prefix so they don't
clash with system packages. Is this the right place/method to put our 
dependencies?

My current take on all this is to go for the completely self-contained option, 
meaning we package everything including dependencies into our artifacts and 
completely ignore any system dependencies. This is motivated partially by my
idea that we should leave packaging into OS-specific packages to the
distributions anyway; that way we don't have to deal with startup scripts,
configuration scripts and the whole bunch of distribution-specific locations.
For me this gives the advantage of not having to worry about specific versions
of system dependencies (and version conflicts with other java applications 
installed via packages) and I don't have to deal with managing package 
dependencies in our packaging code. However, as Chip pointed out today, these 
packages could not be shipped from ASF infra as some of the binaries have 
incompatible licenses (like mysql-connector-java). I know Wido is far more in 
favor of packaging only the bare artifacts and leaving dependencies to the 
distributions as much as possible, but he agrees with me that version 
management of specific dependencies is tricky.

I think it's a bridge too far now to deal with the packaging and license issues
for binary packages and their dependencies. I'd rather focus the effort on 
generating a clear set of artifacts in a very clear directory structure that 
packagers can use to generate their packages. I'm also not against shipping
spec files and the debian build files; it would be great if somebody could just
do "rpmbuild -tb cloudstack-4.1.0.tar.gz" and have a set of packages after a
few minutes, but very limited to just one or two specific versions of
distributions (or even just generate a generic rpm and deb that just installs
our files and leaves specific stuff to the sysadmin).

So what do you guys think?

Cheers,

Hugo



