Re: Build tool for mixed Clojure/Java projects

Ken Wesson Mon, 11 Jul 2011 18:12:35 -0700

On Mon, Jul 11, 2011 at 7:51 PM, Mike Meyer <m...@mired.org> wrote:
> On Mon, 11 Jul 2011 16:21:45 -0400
> Ken Wesson <kwess...@gmail.com> wrote:
>> > So, "repository" does not imply "server" at all,
>> This is getting silly. "Repository" is a word that brings immediately
>> to mind typing checkin and checkout commands at a command prompt in
>> order to work on source code that is stored remotely. And remotely
>> implies "server".
>
> I was with you until you said "stored remotely".


Well, the source code is being worked on collaboratively by
geographically separated people in many cases, and from multiple
office cubicle computers in the most geographically-concentrated case.
The canonical master version of the source code must reside in some
single place, which can be at most one of those developers' computers;
so for the rest of them, it's stored remotely.

What's stored locally may range from a single file being worked on at
a time to a copy of the whole code base, but there is still generally
a master copy and there is still therefore some mechanism for updating
the master copy and resolving edit conflicts that collide there.

That mechanism requires the computer holding the non-master copy or
single file to push its diffs to the computer holding the master copy
(and then the latter is a server) or the computer holding the master
copy to pull diffs from the others (and then all the REST are
servers!).

So not only is there a "stored remotely" in there out of necessity for
anything but a single-developer-on-a-single-machine project, but
there's also a "server" in there, or even multiple servers. The
alternatives I can think of are:

1. One developer, one computer. Version control may be overkill in
such a case anyway and it's not how big, major projects are developed.

2. Many developers, one computer. No "remote storage" and if the
developers are co-located no server; otherwise a terminal server. The
former is obviously not parallelizable (though edit conflicts are thus
a non-issue -- single global lock FTW!!!1!1) and the latter is a
throwback to the 1980s or earlier. :)

3. Many computers, one developer manually synching files or just
carrying the codebase around on a thumbdrive. No servers, no remote
storage that isn't simply overwritten when returned-to. The extra
copies, if any, merely amount to backups. This is most likely a
one-developer project split between a tower and a laptop, or someone
developing a phone app on a computer and on their phone.

4. An ad hoc, peer-to-peer system with many evolving versions of the
codebase and patches swapped back and forth but no canonical "master"
copy. This *might* be workable on a small project (a handful of
developers, not too many LOC) but surely cannot scale much above that
without becoming total chaos. There might be no "server" beyond email
in such a case, used for exchanging diff files or whatever. But I
expect any project organized that way to melt down above a fairly
small size of codebase and/or developer headcount. Versions will
become too divergent, the bigger and more numerous they are, until
patches that worked on the sender's copy often won't work, or won't
work without extensive modification, on the recipient's, and then the
ability to share improvements begins to break down rapidly when that
point is reached. In effect, the codebases begin to evolve into
separate species that can no longer interbreed. Perhaps this is how
bacteria, despite being able to share genes in horizontal transfer and
acquire them after birth, nonetheless have distinct species -- they
become incompatible for all but certain broad classes of "plugin
interface implementation" patches such as, unfortunately,
antibiotic-resistance genes.

This could be made slightly more scalable by modularizing, specifying
module interfaces, and letting the modules evolve separately,
versioning each one by patch level, so the version of a module is
bumped every time it's patched. Patches would be expected to be
acquired and applied in order to keep each module up to date with
everyone else's work. The obvious problem is patch-numbering
collisions, where two developers hack on module Y and both produce
distinct patch 1337s, each unaware of what the other is doing -- in
other words, edit conflicts.
Repositories that number every commit sort-of implement this, but with
a master copy and a database of some sort tracking the changes and the
commit numbers and some mechanism for resolving edit collisions.
Collisions are also detected right away, since both developers will
submit their changes to the central repository. In the peer-to-peer
model each might deliver their own "patch 1337" to a bunch of others
and both could spread for a while among different subsets of the
developers before eventually colliding in the copy of module Y held by
a developer who receives one of the patches and then, later, the other.
Before that, problems could arise if developer X talks to developer Z
and gets patch 1338 for module Y, tries to apply it, and despite
having patch 1337 it doesn't work because patch 1338 depends on one
patch 1337 and developer X has the *other* patch 1337. Z tells him he
mustn't have patch 1337, he insists he does, and confusion ensues until
at some point someone thinks to compare X's and Z's patch 1337s and
discovers that they're not the same... whence, collision fun again.
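
To make the "two different patch 1337s" confusion concrete, here is a
rough Python sketch. Everything in it is an illustrative assumption
rather than how any particular version control system works: the Patch
record, the idea of storing a fingerprint of the patch each one was
written against, and the toy CentralRepo class are all hypothetical
names. It shows a number-only check telling X that patch 1338 should
apply, a content-based check revealing that it was written against the
other patch 1337, and a central repository rejecting the second
"patch 1337" at commit time.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Hypothetical patch record for the peer-to-peer scheme."""
    module: str
    number: int   # patch level this patch produces
    author: str
    diff: str     # stand-in for the actual change
    parent: str   # fingerprint of the patch this one was written against

    def fingerprint(self) -> str:
        # Hash of the content, so two patches that share a number but
        # differ in content get different fingerprints.
        data = "\n".join(
            [self.module, str(self.number), self.parent, self.diff]
        ).encode()
        return hashlib.sha256(data).hexdigest()[:12]


def can_apply_by_number(applied, patch):
    """Naive check: do we have *a* patch with the previous number?"""
    return (patch.number - 1) in applied


def can_apply_by_fingerprint(applied, patch):
    """Stricter check: do we have *the* patch this one was based on?"""
    prev = applied.get(patch.number - 1)
    return prev is not None and prev.fingerprint() == patch.parent


class CentralRepo:
    """Toy master copy that hands out commit numbers one at a time."""

    def __init__(self, head):
        self.head = head

    def commit(self, based_on):
        # A commit made against a stale head is rejected immediately.
        if based_on != self.head:
            raise RuntimeError("edit conflict: update and merge first")
        self.head += 1
        return self.head


# Peer-to-peer: A and B each produce their own patch 1337 for module Y.
base = Patch("Y", 1336, "shared", "baseline", "")
p1337_a = Patch("Y", 1337, "A", "rename foo to bar", base.fingerprint())
p1337_b = Patch("Y", 1337, "B", "delete foo", base.fingerprint())

# Z received A's patch 1337 and wrote patch 1338 on top of it.
p1338 = Patch("Y", 1338, "Z", "call bar twice", p1337_a.fingerprint())

# X received B's patch 1337 instead. Copies map patch number -> Patch.
x_copy = {1336: base, 1337: p1337_b}
z_copy = {1336: base, 1337: p1337_a}

print(can_apply_by_number(x_copy, p1338))       # True: "I do have 1337!"
print(can_apply_by_fingerprint(x_copy, p1338))  # False: wrong 1337
print(can_apply_by_fingerprint(z_copy, p1338))  # True

# Central repository: the second "patch 1337" is caught right away.
repo = CentralRepo(head=1336)
first = repo.commit(based_on=1336)   # becomes commit 1337
try:
    repo.commit(based_on=1336)       # second developer, stale base
    conflict = None
except RuntimeError as err:
    conflict = str(err)
print(first, conflict)
```

Identifying patches by content instead of by a bare sequence number
only makes the mismatch visible sooner; somebody still has to
reconcile the two patch 1337s.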

So, unless 4 can be made workable, then the typical use scenario for a
version control system involves both a) "remote storage" of code and
b) a "server". Their exact roles may vary. The exact way that
checkouts and commits work may vary. But without those two things in
*some* way, shape, or form the poorly-scaling anarchy of case 4 above
seems to me to be inevitable. Hell, even time sharing an old Unix
terminal server with a shell-and-emacs account for each developer
means remote code storage (on the Unix box) and a server (if only a
telnet/ssh server). :)

-- 
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
