Shout out to my fellow Flocker, Matt...
-------- Original Message --------
Subject: RFC: Proposal for a more agile "" (draft of my
Flock talk)
Date: Mon, 22 Jul 2013 09:38:54 -0400
From: Matthew Miller <>
Reply-To: Development discussions related to Fedora
To: Fedora Development List <>
Obviously, no-bundled-libs is a crucial part of the packaging guidelines
today. As a sysadmin, I know why it's important. This is not just a noble
goal, but also something that pragmatically makes systems better. But,
it's also keeping us from having software that people really use in Fedora.
and Hadoop are two big examples. This hurts us more than it helps the
So, in some areas, we need a different approach.
The Big Data SIG is trying to adapt Hadoop 2.x into Fedora for F20
<>, and I'll be sharing our
insights on this at Flock <> in a couple of
weeks. In Matt's conceptual architecture I suppose Hadoop Common would
live in the Ring 2-to-3 orbit somewhere. It is a core in its own right
(it provides a distributed, replicated file system) in that there is an
ever-growing software ecosystem that has emerged around it, and the SIG
would like Fedora to be the OS of choice for that ecosystem: stable
enough for deployment, yet a feature-rich, current, and productive
environment for the developers in that evolving ecosystem. The Hadoop
runtime is an orchestration of JVM-based daemons which can be viewed as
system-level services, thus an obvious candidate for well-defined
integration with Fedora via packaging: correct permissions, systemd
scripts, logs, etc.
However, the root of that core is a set of older and deprecated Java
dependencies (e.g., Jetty 6, Tomcat 5.5) which are expressed via the
Apache Maven build tool. The "quick and dirty" label that another poster
applied to a VERY popular build tool like Maven does it a disservice. The
fact is that it is exceedingly popular in the Java development community
and has been for some time. Anyway, the challenge for this project is
reconciling its stable dependencies with the ever-changing bleeding edge
that is typically found in the latest Fedora release. A lot of our
effort so far has gone into the various API and build specification
changes necessary to try to make Hadoop fit into Fedora.
So far, so good...sort of. We can make the basic use case and tests work
with the modified dependencies, but in doing so we risk giving up parity
with the Apache baseline (including the JRE) and potentially losing out to
other so-called "dirty RPMs". Ideally, we wouldn't be forced into some
of these adaptations and compromises if there were Fedora packaging
alternatives that would give us (a SIG ring?) more control over the
bundles needed by Hadoop as opposed to the ones mandated by the latest
Fedora release. Make no mistake: patches are fed from the SIG to the
Hadoop community to try to bump the versions there. But the upstream
project can't and won't chase an ever-vanishing point in the distance.
They view their lower-level dependencies much like a stable OS such as
RHEL, where any change should be deliberate.
I feel like Matt has at least kick-started the discussion around how
Fedora could evolve to support orthogonal dependency models that more
readily adapt to external projects like Hadoop. Not that our SIG has any
profound answers. :-)
Thus, we are very interested in _any_ packaging architecture proposals
that could help relieve our initiative's pain points, and look forward
to further constructive discussion of the same.
My $0.02,
Peter MacKinnon
MRG Grid/Big Data
Red Hat Inc.
Raleigh, NC