On Jun 14, 2005, at 9:54 PM, Scott Feldman wrote:

We're a quiet bunch.  :-)

Which is a bad thing for Open Source development.  It seems Open MPI is
a closed-source development project with an open-source release model.

At this point in our development, I somewhat agree. But this will soon change. See below.

The FAQ claims the future is in Open Source code, methodology, and
philosophy; so why is the development and testing of Open MPI closed?

It is "somewhat" closed. We have actually had a limited set of third parties involved from the very beginning -- people outside the core Open MPI development team who have tested the code, reported bugs, etc. Such reports have been extremely useful and helpful.

I fully appreciate the benefit of the open source model. Indeed, my parent organization's name is The Open Systems Laboratory (www.osl.iu.edu), which develops and contributes to several well-known open source projects. For example, we've been developing and maintaining LAM/MPI under an open source model for several years, and it works well. As of now, Open MPI is slightly different (see below for why), but will be shifting to a more open source model (like LAM's) in the near future.

Closed-source development doesn't scale.  You're missing out on early
bug reports from users with environments and applications different
than yours.  You're missing out on outside development help in finding
and fixing bugs.

There are several reasons that we have not yet released the code to the public:

1. The first 8 months or so of our project were in "stealth" mode. We honestly didn't know if the collaboration would eventually bear useful fruit. As the project went on, we came to see that it was working (really well, actually), and so we came out of stealth mode, announced it to the world, created the public web site, etc.

2. Once the alpha was available, we didn't want novice users downloading the code, thinking that it was a fully-functional MPI implementation. Even with oodles of warnings on the web site and/or in the tarball itself, some people would definitely try it and send unhelpful "it doesn't work!" flame mails. Hence, it was released to a closed set of 3rd parties who were known to be knowledgeable about MPI and would be able to generate useful bug reports. Their assistance was invaluable to us.

3. The HPC community is quite small, and the competition is quite fierce. We have direct and distinct competition in this space; having a bad release would negatively impact this project and greatly harm our chances of ever having a good release (at least in the eyes of the public). This is an unfortunate reality.

4. Adding more developers to a project does not make it release faster (indeed, it usually slows it down). This is true for any committee model -- the more people on a project, the more opinions need to get discussed and more compromises need to be reached. This is not necessarily a Bad Thing, of course -- independent, outside opinions can shed unique insight into problems -- but it does mean that it takes more time. As I indicated in a previous post, we're taking longer than we expected in terms of the code (truth be told, we had really hoped to be at beta quality by SC last year -- that didn't happen). Adding more developers right now would inevitably make this project take yet more time before releasing -- which we really don't have at this point. We need to get to stability and release, if for no other reason than to answer mails like this.

5. We really wanted to reach some level of stability before we opened to the public. We felt that this would be the best way to make a positive contribution to the HPC community -- present code that works, and then move forward from there. For all the interesting / research-worthy parts of an MPI implementation, there's 10 times that amount of code that is totally uninteresting / maintenance-requiring / internal accounting code and data structures that the MPI has to maintain. Very, very few people outside the core MPI development team will ever look at or care about this kind of code (this has been our experience in other MPI implementations). Specifically, what I mean is that we anticipate that almost no one outside of us will look at 90% of the Open MPI code base -- 3rd party vendors and researchers will be focusing on the 10% of the code base which is performance critical. Unfortunately, the other 90% is what takes a large portion of the time to develop and debug, and is quite useless to 3rd parties if a) it doesn't work, or b) changes quickly enough that it makes working in the 10%-performance-critical parts painful.

6. We're still working through the legal issues to get an Apache-like structure in place to a) guarantee that the code will always be in open source, and b) ensure that all contributions from 3rd parties are "clean" in terms of intellectual property (can you say "SCO"?). This has unfortunately taken *WAY* longer than we anticipated, and is perhaps the biggest reason that we have not invited in 3rd party developers yet.

Please adopt a release-early, release-often strategy.

Actually, this is something that we will desperately try to avoid. Open source does not necessarily equal "release early, release often". IMHO, that methodology tends to imply that at least one reason you have to release often is that there are bugs that need to be fixed. For production-quality software, you really want to release stable software. There are always those who want to be out on the bleeding edge of development with the latest / greatest software (despite the fact that it may not be stable), but in our experience, the vast majority of MPI users just want software that works and don't care about many esoteric features. They just want to run their MPI codes and get stable, repeatable answers.

My experience with LAM/MPI is specifically what I am citing -- indeed, there are still many users who are using [extremely] old versions of LAM/MPI simply because that's what they started developing / using years ago and it still works for them ("it ain't broke, so why fix it?"). I cannot speak for Los Alamos, but FWIW, I think that LA-MPI's experience is similar (they work in a production-quality environment -- slow, production-quality release cycles).

However, to accommodate both kinds of users in LAM/MPI (those who want stability and those who want bleeding edge), we adopted a dual-headed strategy:

1. Slow formal release cycle. LAM/MPI typically has 1-3 releases a year. Usually one major release with a small number of bug fix releases following it.

2. Nightly tarball snapshots available. Anyone who wants to can grab either a Subversion checkout or a nightly snapshot tarball, but no guarantees are made about its stability (because it represents active development).

I anticipate that something analogous will occur for Open MPI.

"Show us the code!"

I have a long, public track record of high-quality open source software, and am firmly committed to putting Open MPI in the same category.

We will show you the code soon, I promise. We've come too far to *not* do so! :-)

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/
