On 11/21/2012 01:18 AM, Hugo Trippaers wrote:
Hey all,
Packaging is a work in progress at the moment. Wido, myself, and others are
working on this, but we are not there yet. This is partly because there is a
multitude of things to consider when we discuss packaging. Hence this mail to
share a lot of the thoughts that went into packaging.
Indeed. Like I mentioned I'm working on this. I'm on Bali right now and
mostly doing the stuff locally on my laptop in the evenings. I won't be
pushing that much since internet here is crappy and I'm also on vacation :)
First of all is how we look at the packages. With the ASF it is pretty clear that the "release" is
the source code. We tag a particular state of the source tree and voila, we have a release. So
"packaging" our "release" is a simple as making at tarball or zip of the code and making
it available to users (aside from the ASF process to name something a release).
Compiling the code as part of our build procedure generates artifacts (using
the mvn package target). Artifacts in this sense are jar, war and zip files
containing a mix of compiled java classes, scripts, documentation (and
optionally dependencies). Most of these artifacts are already automagically
sent to the apache snapshot repositories by jenkins because we inherit the
maven configuration from the apache master (or as a final release if we wanted
to).
Finally "packaging" is taking the artifacts generated by the compile/build step
and turning it into some kind of OS specific package like an RPM or DEB.
I agree. I just pushed the branch "packaging" with this commit:
https://git-wip-us.apache.org/repos/asf?p=incubator-cloudstack.git;a=commit;h=91646a77c0a5b373db9afeafc2407d5893f0cca6
By running "dpkg-buildpackages" you get Debian packages. I haven't
verified if they actually work yet.
Please read the commit message to get a better understanding of my ideas
behind it.
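For anyone who wants to try the branch, the build would be roughly as follows. This is a sketch, not verified output: the clone URL is the standard git-wip-us form, and it assumes dpkg-dev plus the build dependencies from debian/control are installed.

```shell
# Build unsigned Debian packages from a checkout of the "packaging" branch.
git clone https://git-wip-us.apache.org/repos/asf/incubator-cloudstack.git
cd incubator-cloudstack
git checkout packaging

# -us/-uc skip signing the source and changes files, -b builds binary
# packages only; the resulting .deb files land in the parent directory.
dpkg-buildpackage -us -uc -b
ls ../*.deb
```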
The current build system based around Maven is designed for this way of
working. Maven takes responsibility only for turning the source code into
artifacts; additional scripts can take those artifacts and combine them into
packages. This is explicitly done this way so as not to clutter a generic,
multiplatform Java build process with very OS-specific packaging steps. The
packaging step should be something like: download sources as a tarball, extract,
run mvn package (with any additional profiles you need), and generate packages from
the artifacts. In the packaging directory there is a centos63 spec file which
does this (see the %build and %install sections).
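Spelled out as commands, that flow looks something like the sketch below. The download URL, version number, and spec file path are illustrative assumptions, not taken from the actual spec file:

```shell
# Roughly what the %build/%install steps amount to: fetch the source
# release, unpack it, build artifacts with Maven, then hand the
# artifacts to the OS packaging tool.
wget https://www.apache.org/dist/incubator/cloudstack/apache-cloudstack-4.1.0-src.tar.gz
tar xzf apache-cloudstack-4.1.0-src.tar.gz
cd apache-cloudstack-4.1.0-src

# Build the artifacts; add any additional profiles you need here.
mvn package -DskipTests

# Package the generated artifacts into RPMs using the shipped spec file
# (path is an assumption for illustration).
rpmbuild -bb packaging/centos63/cloud.spec
```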
Currently we have released the 4.0.0 version as a source release and some
members have graciously offered to compile, build and package the source and
make that available to the community for their convenience. I think the
explicit wish is that we are able to provide a distribution ourselves (meaning
the Apache CloudStack community) instead of just providing source code. This
brings us back to the original discussion regarding licensing and how to deal
with dependencies in packages.
Wido and I had a lengthy discussion on this subject at ApacheCon EU. We
haven't reached a conclusion yet, but we do have a good enough understanding of the
problem to bring our ideas and discussions to the list.
One of the main arguments is how to deal with dependencies in packages. There
are a couple of options here:
* Create self-contained artifacts in step 2 (compile/build) that
contain all required dependencies. Packages made from these artifacts need no
outside dependencies save the bare essentials like java runtime, tomcat and
python.
* Create "bare" artifacts with just our compiled code in step 2
(compile/build). Packages need to include all dependencies for all jars that are
generated as part of our code.
* Hybrid mix and match of the two options above.
* What the old waf build did, which is even worse: compile bare artifacts in step 2
(build/compile), then package downloaded dependencies as far as the ASF
permits and let the package dependencies deal with the others (like
mysql-connector-java).
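For reference, the first (self-contained) option is what Maven's shade plugin produces. A minimal sketch of the configuration follows; the plugin version and where in our pom files it would go are assumptions, not taken from the current build:

```xml
<!-- Sketch: bundle all runtime dependencies into a single "uber" jar
     during the package phase, so the artifact needs no outside jars. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```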
In my view one of the big issues here is version management of our dependencies.
If we want to ship packages for certain operating systems we need to align our
dependency versions to exactly the versions shipped by those distributions. I know
that usually a higher or lower version of a java dependency will just work, but
history has proven that you should not rely on this. If we have tested our version
of CloudStack with version 5.1.12 of mysql-connector-java, should we ship a package
with a dependency on that specific version, or hope for the best and allow
any version? (Leading the witness, I know.) We also have quite a few dependencies
on ancient libraries (> 3 years old) that might not be available as packages at
all, or at least not in the distributions' repos.
Next question is where to put stuff we package. We toss a lot of things in
/usr/share/java for example, in some cases with the cloud- prefix so they don't
clash with system packages. Is this the right place/method to put our
dependencies?
My current take on all this is to go for the completely self-contained option,
meaning we package everything including dependencies into our artifacts and
completely ignore any system dependencies. This is motivated partially by my
idea that we should leave packaging into OS specific packages to the
distributions anyway, that way we don't have to deal with startup scripts,
configuration scripts and the whole bunch of distribution specific locations.
For me this gives the advantage of not having to worry about specific versions
of system dependencies (and version conflicts with other Java applications
installed via packages), and I don't have to deal with managing package
dependencies in our packaging code. However, as Chip pointed out today, these
packages could not be shipped from ASF infra, as some of the binaries have
incompatible licenses (like mysql-connector-java). I know Wido is far more in
favor of packaging only the bare artifacts and leaving dependencies to the distributions
as much as possible, but he agrees with me that version management of specific dependencies is tricky.
Yes, I was in favor of having minimal/bare artifacts, but since I've
been playing with the self-contained artifacts I think my opinion has
changed.
The reason for this is that we indeed don't have to worry about what
version of a library a specific distribution is shipping in their repo.
Right now we "support" RHEL 6.3 and Ubuntu 12.04, but if we want to
expand that support we have to do a lot of research and work to actually
make it work and keep it working.
It's already hard enough to figure out the KVM/Qemu/libvirt dependencies
for other distributions.
As long as we bundle all the dependencies in one JAR/WAR I don't have a
problem with it. Yes, those files can become pretty big, right now the
agent is 12M and the server is 28M.
That's however not a problem when you are running your hypervisors on a
USB-stick or so.
Our JAR/WAR can be cleanly installed by a package manager and
uninstalled again.
This way the KVM agent package only has:
* init script
* base configuration
* JAR file
* log directory
No dependencies on other Java packages or so, just that.
We don't have to worry about prefixing stuff with cloud-* or so to
prevent naming conflicts.
Yes, in the ideal world I'd like to see our dependencies come from the
distribution's repository, but we don't live in the ideal world.
We want to keep supporting Ubuntu 12.04 at least until April 2014 when
the next LTS comes out. Get my point? That means that for at least two
years we have to figure out Java dependencies. I'd rather not.
It will already be hard enough to keep working with a libvirt which is
over two years old in 2014.
Same goes of course for RHEL, but I'm an Ubuntu guy :-)
Wido
I think it's a bridge too far right now to deal with the packaging and license issues for
binary packages and their dependencies. I'd rather focus the effort on generating a
clear set of artifacts in a very clear directory structure that packagers can use to
generate their packages. I'm also not against shipping spec files and the Debian
build files; it would be great if somebody could just run "rpmbuild -tb
cloudstack-4.1.0.tar.gz" and have a set of packages after a few minutes, but very
limited to just one or two specific versions of distributions (or even just generate
a generic rpm and deb that just installs our files and leaves the specific stuff to the
sysadmin).
So what do you guys think?
Cheers,
Hugo