Hi Ken

On 12 July 2011 03:12, Ken Wesson <kwess...@gmail.com> wrote:
> On Mon, Jul 11, 2011 at 7:51 PM, Mike Meyer <m...@mired.org> wrote:
>> On Mon, 11 Jul 2011 16:21:45 -0400
>> Ken Wesson <kwess...@gmail.com> wrote:
>>> > So, "repository" does not imply "server" at all,
>>> This is getting silly. "Repository" is a word that brings immediately
>>> to mind typing checkin and checkout commands at a command prompt in
>>> order to work on source code that is stored remotely. And remotely
>>> implies "server".
>>
>> I was with you until you said "stored remotely".
>
> Well, the source code is being worked on collaboratively by
> geographically separated people in many cases, and from multiple
> office cubicle computers in the most geographically-concentrated case.
> The canonical master version of the source code must reside in some
> single place, which can be at most one of those developers' computers;
> so for the rest of them, it's stored remotely.
This may be the case, but a "repository" in no way implies that there
are multiple developers involved. Just as (I thought we had agreed
before this thread was revived) a "repository" in no way implies a
network. Also, even if there is some central, authoritative repository,
the individual developers may (e.g. in the case of Git or Mercurial)
still have local repositories that are not inextricably linked to the
central repository.

> What's stored locally may range from a single file being worked on at
> a time to a copy of the whole code base, but there is still generally
> a master copy and there is still therefore some mechanism for updating
> the master copy and resolving edit conflicts that collide there.

The conflict resolution in CVS or Subversion or Mercurial or Git all
happens locally. Not in some central server or repository. Some crude
form of conflict detection might happen in the central repository, but
so what? That does not mean that "repository => network", which was
your original argument, and it does not mean that a repository is
inherently shared between multiple developers.

> That mechanism requires the computer holding the non-master copy or
> single file to push its diffs to the computer holding the master copy
> (and then the latter is a server) or the computer holding the master
> copy to pull diffs from the others (and then all the REST are
> servers!).

If there is a central repository then yes, changes need to be pushed to
it or pulled into it from elsewhere. But the existence of a networked
repository, or of a repository shared on a multi-user machine with no
networking involved, does not preclude the existence of non-networked,
single-user repositories. So again, a repository does not imply that
there is a network involved.
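For example (a quick sketch using Git; Mercurial is much the same, and
the directory names here are just made up), both a working repository
and a "central" one can live on a single machine with no network
anywhere in sight:

    $ mkdir project && cd project
    $ git init                        # a real repository; no server involved
    $ echo "hello" > README
    $ git add README
    $ git commit -m "initial commit"

    # a shared "master" copy can just be another directory on the same disk:
    $ git clone --bare . /tmp/central.git
    $ git remote add origin /tmp/central.git
    $ git push origin master          # "push" over a plain filesystem path

Everything above happens on one disk. You can later publish the same
repository over ssh or HTTP if you want to, but that is an extra step,
not something the word "repository" requires.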
> So not only is there a "stored remotely" in there out of necessity for
> anything but a single-developer-on-a-single-machine project, but
> there's also a "server" in there, or even multiple servers. The
> alternatives I can think of are:
>
> 1. One developer, one computer. Version control may be overkill in
> such a case anyway and it's not how big, major projects are developed.

I'd disagree that version control is overkill, but that is irrelevant.
Anyway, we're not talking about "big, major projects". We're talking
about the meaning of the word "repository".

> 2. Many developers, one computer. No "remote storage" and if the
> developers are co-located no server; otherwise a terminal server. The
> former is obviously not parallelizable (though edit conflicts are thus
> a non-issue -- single global lock FTW!!!1!1) and the latter is a
> throwback to the 1980s or earlier. :)
>
> 3. Many computers, one developer manually synching files or just
> carrying the codebase around on a thumbdrive. No servers, no remote
> storage that isn't simply overwritten when returned-to. The extra
> copies, if any, merely amount to backups. Most likely with a
> one-developer project with a tower and a laptop, or developing a phone
> app on a computer and on their phone.
>
> 4. An ad hoc, peer-to-peer system with many evolving versions of the
> codebase and patches swapped back and forth but no canonical "master"
> copy. This *might* be workable on a small project (a handful of
> developers, not too many LOC) but surely cannot scale much above that
> without becoming total chaos. There might be no "server" beyond email
> in such a case, used for exchanging diff files or whatever. But I
> expect any project organized that way to melt down above a fairly
> small size of codebase and/or developer headcount. Versions will
> become too divergent, the bigger and more numerous they are, until
> patches that worked on the sender's copy often won't work, or won't
> work without extensive modification, on the recipient's, and then the
> ability to share improvements begins to break down rapidly when that
> point is reached. In effect, the codebases begin to evolve into
> separate species that can no longer interbreed. Perhaps this is how
> bacteria, despite being able to share genes in horizontal transfer and
> acquire them after birth, nonetheless have distinct species -- they
> become incompatible for all but certain broad classes of "plugin
> interface implementation" patches such as, unfortunately,
> antibiotic-resistance genes.
>
> This could be made slightly more scalable by modularizing, specifying
> module interfaces, and letting the modules evolve separately,
> versioning each one by patch level, so the version of a module is
> bumped every time it's patched. Patches expected to be acquired and
> applied in order to keep each module up to date with everyone else's
> work. Obvious problem with patch-numbering collisions, where two
> developers hack on module Y and both produce distinct patch 1337s
> unaware of what the other is doing -- in other words, edit conflicts.
> Repositories that number every commit sort-of implement this, but with
> a master copy and a database of some sort tracking the changes and the
> commit numbers and some mechanism for resolving edit collisions.
> Collisions are also detected right away, since both developers will
> submit their changes to the central repository. In the peer-to-peer
> model each might deliver their own "patch 1337" to a bunch of others
> and both could spread for a while among different subsets of the
> developers before eventually colliding in one developer's copy of
> module Y who receives one of the patches and then, later, the other.
> Before that, problems could arise if developer X talks to developer Z
> and gets patch 1338 for module Y, tries to apply it, and despite
> having patch 1337 it doesn't work because patch 1338 depends on one
> patch 1337 and developer X has the *other* patch 1337. Z tells him he
> mustn't have patch 1337, he insists he does, confusion ensues until at
> some point someone thinks to compare X's and Z's patch 1337s and
> discovers that they're not the same... whence, collision fun again.
>
> So, unless 4 can be made workable, then the typical use scenario for a
> version control system involves both a) "remote storage" of code and
> b) a "server". Their exact roles may vary. The exact way that
> checkouts and commits work may vary. But without those two things in
> *some* way, shape or form, the poorly-scaling anarchy of case 4 above
> seems to me to be inevitable. Hell, even time sharing an old Unix
> terminal server with a shell-and-emacs account for each developer
> means remote code storage (on the Unix box) and a server (if only a
> telnet/ssh server). :)

I only skimmed the last bit. Sorry :)

I don't think it changes the fact that "repository" does not imply a
network. All the talk about best practices etc. is irrelevant to the
argument.

-- 
Michael Wood <esiot...@gmail.com>