On 11/21/2012 01:18 AM, Hugo Trippaers wrote:
Hey all,

Packaging is a work in progress at the moment. Wido, I and others are working
on this, but we are not there yet. Partly this is because there is a multitude
of things to consider when we discuss packaging. Hence this mail to share a lot
of the thoughts that went into packaging.


Indeed. As I mentioned, I'm working on this. I'm on Bali right now and mostly doing the work locally on my laptop in the evenings. I won't be pushing much since the internet here is crappy and I'm also on vacation :)

First of all is how we look at the packages. With the ASF it is pretty clear that the "release" is 
the source code. We tag a particular state of the source tree and voila, we have a release. So 
"packaging" our "release" is as simple as making a tarball or zip of the code and making 
it available to users (aside from the ASF process to name something a release).
Compiling the code as part of our build procedure generates artifacts (using 
the mvn package target). Artifacts in this sense are jar, war and zip files 
containing a mix of compiled Java classes, scripts, documentation (and 
optionally dependencies). Most of these artifacts are already automagically 
sent to the Apache snapshot repositories by Jenkins because we inherit the 
Maven configuration from the Apache master (or as a final release if we wanted 
to).
Finally "packaging" is taking the artifacts generated by the compile/build step 
and turning it into some kind of OS specific package like an RPM or DEB.
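
To make that split concrete, a rough sketch of what step 2 produces and step 3 picks up (profiles and paths here are just examples, nothing is fixed):

    mvn package    # add whatever profiles you need
    # list the jar/war/zip artifacts that the packaging step would pick up
    find . -path '*/target/*' \( -name '*.jar' -o -name '*.war' -o -name '*.zip' \)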


I agree. I just pushed the branch "packaging" with this commit: https://git-wip-us.apache.org/repos/asf?p=incubator-cloudstack.git;a=commit;h=91646a77c0a5b373db9afeafc2407d5893f0cca6

By running "dpkg-buildpackages" you get Debian packages. I haven't verified if they actually work yet.

Please read the commit message to get a better understanding of my ideas behind it.
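
For anyone who wants to try it, the rough sequence would be something like this (the -uc/-us flags just skip signing, adjust to taste):

    git checkout packaging
    dpkg-buildpackage -uc -us
    ls ../*.deb    # the resulting packages end up one directory up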

The current build system based around Maven is designed for this way of 
working. Maven is only responsible for turning the source code into 
artifacts, and additional scripts can take those artifacts and combine them into 
packages. This is done deliberately so as not to clutter a generic, 
multi-platform Java build process with very OS-specific packaging steps. The 
packaging step should be something like: download the sources as a tarball, extract, 
run mvn package (with any additional profiles you need), and generate packages from 
the artifacts. In the packaging directory there is a centos63 spec file which 
does this (see the %build and %install sections).
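
As a minimal sketch of that flow (the tarball name is only an example, and the mvn invocation takes whatever profiles you need):

    tar xzf apache-cloudstack-4.0.0-incubating-src.tar.gz
    cd apache-cloudstack-4.0.0-incubating-src
    mvn package
    # then copy the resulting jars/wars into the package buildroot;
    # the %build and %install sections of the centos63 spec file follow this same pattern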

Currently we have released the 4.0.0 version as a source release and some 
members have graciously offered to compile, build and package the source and 
make that available to the community for their convenience. I think the 
explicit wish is that we are able to provide a distribution ourselves (meaning 
the Apache CloudStack community) instead of just providing source code. This 
brings us back to the original discussion regarding licensing and how to deal 
with dependencies in packages.

Wido and I had a lengthy discussion on this subject at ApacheCon EU and we 
haven't reached a conclusion yet, but we do have a good enough understanding of the 
problem to bring our ideas and discussions to the list.

One of the main arguments is how to deal with dependencies in packages. There 
are a couple of options here:

*         Create self-contained artifacts in step 2 (compile/build) that 
contain all required dependencies. Packages made from these artifacts need no 
outside dependencies save the bare essentials like a Java runtime, Tomcat and 
Python.

*         Create "bare" artifacts with just our compiled code in step 2 
(compile/build). Packages need to include all dependencies for all jars that are 
generated as part of our code.

*         Hybrid mix and match of the two options above.

*         The old waf build is even worse: we compile bare artifacts in step 2 
(build/compile) and then package downloaded dependencies as far as the ASF 
permits, letting package dependencies handle the others (like 
mysql-connector-java).

In my view one of the big issues here is version management of our dependencies. 
If we want to ship packages for certain operating systems we need to align our 
dependency versions to exactly the versions shipped by those distributions. I know 
that usually a higher or lower version of a Java dependency will just work, but 
history has proven that you should not rely on this. If we have tested our version 
of CloudStack with version 5.1.12 of mysql-connector-java, should we ship a package 
that depends on that specific version, or hope for the best and allow any version? 
(Leading the witness, I know.)  We also have quite a few dependencies on ancient 
libraries (more than 3 years old) that might not be available as packages in the 
distributions' repos, or might not have packages at all.
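
As a quick illustration of why this matters: even just checking what a target distribution actually ships, before deciding between pinning to the tested version (say "= 5.1.12") or leaving the dependency unversioned, is distro-specific legwork (the package names below are examples and differ per distribution):

    yum info mysql-connector-java | grep Version    # RHEL/CentOS
    apt-cache policy libmysql-java                  # Ubuntu/Debian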

The next question is where to put the stuff we package. We toss a lot of things into 
/usr/share/java for example, in some cases with the cloud- prefix so they don't 
clash with system packages. Is this the right place/method to put our 
dependencies?

My current take on all this is to go for the completely self-contained option, 
meaning we package everything including dependencies into our artifacts and 
completely ignore any system dependencies. This is motivated partially by my 
idea that we should leave packaging into OS-specific packages to the 
distributions anyway; that way we don't have to deal with startup scripts, 
configuration scripts and the whole bunch of distribution-specific locations. 
For me this gives the advantage of not having to worry about specific versions 
of system dependencies (and version conflicts with other Java applications 
installed via packages), and I don't have to deal with managing package 
dependencies in our packaging code. However, as Chip pointed out today, these 
packages could not be shipped from ASF infra as some of the binaries have 
incompatible licenses (like mysql-connector-java). I know Wido is far more in 
favor of packaging only the bare artifacts and leaving dependencies to the 
distributions as much as possible, but he agrees with me that version management of specific dependencies is tricky.

Yes, I was in favor of having minimal/bare artifacts, but since I've been playing with the self-contained artifacts I think my opinion has changed.

The reason for this is that we indeed don't have to worry about what version of a library a specific distribution is shipping in their repo.

Right now we "support" RHEL 6.3 and Ubuntu 12.04, but if we want to expand that support we have to do a lot of research and work to actually make it work and keep it working.

It's already hard enough to figure out the KVM/Qemu/libvirt dependencies for other distributions.

As long as we bundle all the dependencies in one JAR/WAR I don't have a problem with it. Yes, those files can become pretty big, right now the agent is 12M and the server is 28M.

That's however not a problem when you are running your hypervisors on a USB-stick or so.
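
If you want to sanity-check what such a self-contained build actually bundles, something along these lines works (the module paths are examples from my local tree, not fixed locations):

    ls -lh agent/target/*.jar                 # the bundled agent jar, ~12M here
    unzip -l client/target/*.war | tail -5    # the server war, ~28M, dependencies included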

Our JAR/WAR can be cleanly installed by a package manager and de-installed again.

This way the KVM agent package only has:
* init script
* base configuration
* JAR file
* log directory

No dependencies on other Java packages or so, just that.
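
Purely as an illustration (the package name and paths are hypothetical, the actual layout is up to whoever writes the packaging files), the installed file list could be as small as:

    dpkg -L cloudstack-agent
    # /etc/init.d/cloudstack-agent                        <- init script
    # /etc/cloudstack/agent/agent.properties              <- base configuration
    # /usr/share/cloudstack-agent/cloudstack-agent.jar    <- the single bundled JAR
    # /var/log/cloudstack/agent                           <- log directory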

We don't have to worry about prefixing stuff with cloud-* or so to prevent naming conflicts.

Yes, in the ideal world I'd like to see our dependencies come from the distribution's repository, but we don't live in the ideal world.

We want to keep supporting Ubuntu 12.04 at least until April 2014 when the next LTS comes out. Get my point? That means that for at least two years we have to figure out Java dependencies. I'd rather not.

It will already be hard enough to keep working with a libvirt which is over two years old in 2014.

Same goes of course for RHEL, but I'm an Ubuntu guy :-)

Wido


I think it's a bridge too far right now to deal with the packaging and license issues for 
binary packages and their dependencies. I'd rather focus the effort on generating a 
clear set of artifacts in a very clear directory structure that packagers can use to 
generate their packages. I'm also not against shipping spec files and the Debian 
build files; it would be great if somebody could just do "rpmbuild -tb 
cloudstack-4.1.0.tar.gz" and have a set of packages after a few minutes, even if 
limited to just one or two specific versions of distributions (or even just generate 
a generic rpm and deb that just installs our files and leaves the specific stuff to the 
sysadmin).
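
For reference, that "just works" flow for an end user would be nothing more than this (where the built packages end up depends on the local rpmbuild configuration):

    rpmbuild -tb cloudstack-4.1.0.tar.gz
    ls ~/rpmbuild/RPMS/*/    # the freshly built packages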

So what do you guys think?

Cheers,

Hugo