Dear Nick and Uri,

Thank you for your replies.

I'm not really into parallel processing, so forgive me if I talk nonsense :). From your descriptions I can sketch the following two scenarios:

1. We use a kind of batch system. For instance, a Linux bash script with 10 recon-all commands or with 10 FSL feat commands. The system assigns each individual command to the node with the least load, and the output data are automatically stored on a central (e.g. RAID) server. I guess this is the way the UCLA FSL/Mac grid works, since I see no need to alter the software itself as long as the applications can read from and write to an external computer. I'll ask the group exactly how they work. I guess we would need very fast connections between the individual servers and the storage host to do this. I don't know PBS, but I think this is exactly how the Sun Grid Engine works as well, so maybe they are comparable. Batches are submitted to the host and are processed one at a time. In fact, this is how I sometimes work myself right now when I'm in a hurry: I copy all my FSL source data to different computers and let each computer run a subset of the feat batch. Afterwards I collect the data by hand and perform only the group averaging on a single computer.

2. We allow for more sophisticated parallelization. You give the example of processing the left and right hemispheres separately. This is the most elegant solution, but it implies rewriting the code.

I think the second scenario would eventually be the most elegant, but the first would be less difficult to implement. In fact, I guess most applications (like FSL, Freesurfer, SPM) work with batches, so if we have software that manages these batches we have already won a lot, and it saves a lot of manual work ;(

What do you think?

Thank you,
Andries van der Leij

PS: Here's a presentation with screenshots of the Sun grid; it looks fairly straightforward:
http://www.sun.com/products-n-solutions/edu/whitepapers/pdf/bioinformatics_supercomputer.pdf

PS2: Nick, are you Dutch?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of U. Hasson
Sent: Wednesday, December 06, 2006 9:50 PM
To: Nick Schmansky
Cc: Andries van der Leij; Freesurfer Mailing List
Subject: Re: [Freesurfer] Freesurfer and Grid computing

At the Uni. of Chicago, we've been playing around with parallelizing freesurfer on a 128 node grid (256 processors). Developers have parallelized procedures (scripts) by unpacking "for" loops that rotate across left and right hemispheres [i.e., they fork the independent processing of left and right hemispheres to different nodes running in parallel, whenever possible].

The main point of this work is to acquire provenance records and therefore Freesurfer scripts are "wrapped" or expressed using a virtual data system language (VDS/VDL). The freesurfer implementation, AFAIK, is in its baby steps, but the general workflow model is pretty well established:
http://www.ci.uchicago.edu/wiki/bin/view/VDS/VDSWeb/WebMain

Best,
Uri

On 12/6/06, Uri Hasson <[EMAIL PROTECTED]> wrote:
> Here (Uni. of Chicago), we've been playing around with parallelizing
> freesurfer on a 128 node grid (256 processors), and developers have
> parallelized procedures (scripts) by unpacking "for" loops that
> rotate across left and right hemispheres [i.e., they fork the
> independent processing of left and right hemispheres to different
> nodes running in parallel, whenever possible].
>
> The main point of this work is to acquire provenance records and
> therefore Freesurfer scripts are "wrapped" or expressed using a
> virtual data system language (VDS/VDL).
> The freesurfer implementation,
> AFAIK, is in its baby steps, but the general workflow model is pretty
> well established
> http://www.ci.uchicago.edu/wiki/bin/view/VDS/VDSWeb/WebMain
>
> Best,
> Uri
>
>
> On 12/6/06, Nick Schmansky <[EMAIL PROTECTED]> wrote:
> > Andries,
> >
> > I am not aware of usage of Freesurfer in a (Sun) Grid Engine environment
> > (such as that used by the Cohen group at UCLA).
> >
> > However, here at the MGH/MIT/HMS Martinos Center we use a cluster of
> > some 100+ nodes configured with Linux Centos 4, and governed by PBS
> > (Portable Batch System). Researchers here often conduct studies with
> > dozens to hundreds of brains, and for each subject, an instance of
> > Freesurfer's 'recon-all -s <subject> -all' script is submitted to the
> > batch system, which, under the hood, gets submitted to one computing
> > node. Thus, several dozen brains can be processed in a day (and a
> > half).
> >
> > Freesurfer does not currently support fine-grain parallelism. Some
> > coarse-grain parallelism, whereby each brain hemisphere is processed
> > independently (benefiting multiprocessor nodes) is possible, but not
> > currently implemented in our 'recon-all' script, as the error handling
> > and logging for doing so is somewhat tricky (and so this feature is
> > in-the-works-but-not-anytime-soon).
> >
> > In short, if you plan on using Freesurfer in studies with large numbers
> > of subjects, I would recommend some kind of computing cluster, and some
> > fairly simple batch software (like PBS) should be sufficient. For
> > instance, I know of one group that has successfully run Freesurfer on
> > their Altix Itanium Linux cluster.
> >
> > Groetjes,
> >
> > Nick
> >
> >
> > On Wed, 2006-12-06 at 18:44 +0100, Andries van der Leij wrote:
> > >
> > > ______________________________________________________________________
> > >
> > > From: Andries van der Leij
> > > Sent: Wednesday, December 06, 2006 5:59 PM
> > > To: 'freesurfer@nmr.mgh.harvard.edu'
> > > Subject: Freesurfer and Grid computing
> > >
> > > Dear Freesurfer community,
> > >
> > > I'm a PhD student at the University of Amsterdam and I'm currently
> > > investigating ways to streamline our MRI data processing
> > > stream. Next summer we'll obtain a research-only scanner. I'm trying
> > > to push the group to also invest in computing power and am currently
> > > investigating the applications that researchers will most probably
> > > use.
> > >
> > > I came across a project of the Cohen group at UCLA. They have
> > > configured an Apple (Unix) grid and have proposed a more or less
> > > standard setup specially designed for MRI analyses:
> > >
> > > http://airto.bmap.ucla.edu/mt-static/NICluster/archives/2005/06/welcome.html
> > >
> > > It is my understanding that one of the members has rewritten the FSL
> > > code to allow distributed parallel processing in a grid. See the
> > > benchmarks here:
> > >
> > > http://airto.bmap.ucla.edu/bmcweb/bmc_bios/MarkCohen/Apple/Benchmarks.htm
> > >
> > > My question is fairly simple: Are similar steps being taken in the
> > > Freesurfer community? I have no experience with this app myself, but
> > > it is my understanding that Freesurfer consumes a lot of resources.
> > >
> > > Thank you very much in advance,
> > >
> > > Andries van der Leij
> > >
> > > _______________________________________________
> > > Freesurfer mailing list
> > > Freesurfer@nmr.mgh.harvard.edu
> > > https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer
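
The first scenario discussed in this thread (a batch of independent commands, each handed to the node with the least load) can be illustrated with a minimal Python sketch. This is only a toy model under stated assumptions, not how PBS or the Sun Grid Engine actually schedule work: the node names, the subject IDs and the 24-hour run time per subject are all hypothetical, and a real batch system measures load at run time instead of relying on fixed estimates. Only the 'recon-all -s <subject> -all' command form is taken from the thread.

```python
import heapq


def assign_jobs(jobs, nodes):
    """Greedily give each job to the node with the least accumulated load.

    jobs  -- list of (command, estimated_hours) pairs
    nodes -- list of node names
    Returns a dict mapping node name -> list of commands.
    """
    # Heap entries are (current_load, node_index); ties go to the earlier node.
    heap = [(0.0, i) for i in range(len(nodes))]
    heapq.heapify(heap)
    plan = {name: [] for name in nodes}
    for command, hours in jobs:
        load, i = heapq.heappop(heap)            # least-loaded node so far
        plan[nodes[i]].append(command)
        heapq.heappush(heap, (load + hours, i))  # record its new load
    return plan


if __name__ == "__main__":
    # Hypothetical subject IDs and run-time estimate (assumptions).
    subjects = [f"subj{n:02d}" for n in range(1, 11)]
    jobs = [(f"recon-all -s {s} -all", 24.0) for s in subjects]
    # Hypothetical node names (assumptions).
    plan = assign_jobs(jobs, ["node1", "node2", "node3"])
    for node, commands in plan.items():
        print(node, len(commands))
```

With ten equally long jobs and three nodes, the jobs are simply dealt out in turn. The greedy choice only starts to matter when run times differ, which is the situation in which a batch system saves the manual bookkeeping described in the thread.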