Nick and Uri, thank you for your replies.

I'm not very familiar with parallel processing, so forgive me if I talk
nonsense :). From your descriptions I can sketch the following possible
scenarios:

1. We use a sort of batch system. For instance, a Linux bash script with
10 recon-all commands or with 10 FSL feat commands. The system assigns
each individual command to the node with the least load, and the output
data are automatically stored on a central (e.g. RAID) server. I guess
this is the way the UCLA FSL/Mac grid works, since I can see no need to
alter the software itself as long as the applications can read from
and write to an external computer. I'll contact the group to ask exactly
how they work. I guess we need very fast connections between the
individual servers and the storage host to do this. I don't know PBS,
but I think this is exactly how the Sun Grid Engine works as well, so
maybe they are comparable. Batches are submitted to the host and are
processed one at a time. In fact, this is how I sometimes work myself
right now when I'm in a hurry: I copy all my FSL source data to
different computers and let each computer run a subset of the feat
batch. Afterwards I collect the data by hand and perform only the group
averaging on a single computer.

2. We allow for more sophisticated parallelization. You give the example
of running the reconstruction of the left and right hemispheres
separately. This is the most elegant solution, but it implies rewriting
the code.
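
To make the second scenario concrete for myself, here is a minimal
sketch in Python of what "unrolling" the hemisphere loop could look
like. It is only an illustration under assumptions: the function
reconstruct_hemisphere is a hypothetical placeholder (not an actual
Freesurfer call), and the labels "lh" and "rh" are just my names for
the two independent units of work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Assumed labels for the two independent units of work.
HEMISPHERES = ("lh", "rh")


def reconstruct_hemisphere(subject: str, hemi: str) -> str:
    """Hypothetical placeholder for the per-hemisphere processing."""
    # A real implementation would start the hemisphere-specific steps here;
    # this stub only reports which unit of work it was given.
    return f"{subject}:{hemi}:done"


def reconstruct_subject(subject: str) -> List[str]:
    """Run both hemispheres concurrently instead of looping over them."""
    with ThreadPoolExecutor(max_workers=len(HEMISPHERES)) as pool:
        futures = [
            pool.submit(reconstruct_hemisphere, subject, hemi)
            for hemi in HEMISPHERES
        ]
        # Collect results in submission order; result() re-raises any
        # exception that occurred in a worker.
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(reconstruct_subject("subject01"))
```

Error handling and logging are reduced here to re-raising the first
failure, which is presumably the part that is tricky in practice.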

I think the second scenario would eventually be the most elegant, but
the first scenario would be less difficult to implement. In fact, I
guess most applications (like FSL, Freesurfer, SPM) work with batches,
so if we have software that manages these batches we have already won a
lot (and it saves a lot of manual work ;( ).
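
As a toy illustration of the first scenario, the sketch below spreads a
batch of recon-all commands over a few nodes by always picking the node
with the fewest assigned jobs. This is only my simplified picture, not
how PBS or the Sun Grid Engine actually decide: the node names are made
up, and "load" is assumed to be simply the number of jobs already
assigned to a node.

```python
import heapq
from typing import Dict, List


def assign_to_least_loaded(commands: List[str], nodes: List[str]) -> Dict[str, List[str]]:
    """Greedy plan: each command goes to the node with the fewest jobs so far."""
    # Heap entries are (job_count, node_name); the smallest count is popped
    # first, and ties are broken alphabetically by node name.
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)
    plan: Dict[str, List[str]] = {node: [] for node in nodes}
    for command in commands:
        load, node = heapq.heappop(heap)
        plan[node].append(command)
        heapq.heappush(heap, (load + 1, node))
    return plan


if __name__ == "__main__":
    # Hypothetical subject and node names, for illustration only.
    subjects = [f"subj{i:02d}" for i in range(1, 11)]
    commands = [f"recon-all -s {subject} -all" for subject in subjects]
    plan = assign_to_least_loaded(commands, ["node1", "node2", "node3"])
    for node, jobs in plan.items():
        print(node, len(jobs))
```

A real batch system would measure the actual load on each node and
would also handle queuing, failures and the central storage; the sketch
only shows the assignment idea.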

What do you think?

Thank you,

Andries van der Leij

PS: here's a presentation with screenshots of the sun grid: it looks
fairly straightforward.

http://www.sun.com/products-n-solutions/edu/whitepapers/pdf/bioinformatics_supercomputer.pdf

PS2: Nick, are you Dutch?



-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of U.
Hasson
Sent: Wednesday, December 06, 2006 9:50 PM
To: Nick Schmansky
Cc: Andries van der Leij; Freesurfer Mailing List
Subject: Re: [Freesurfer] Freesurfer and Grid computing

At the University of Chicago, we've been playing around with
parallelizing Freesurfer on a 128-node grid (256 processors). Developers
have parallelized procedures (scripts) by unrolling "for" loops that
iterate over the left and right hemispheres [i.e., they fork the
independent processing of the left and right hemispheres to different
nodes running in parallel, whenever possible].

The main point of this work is to acquire provenance records, and
therefore the Freesurfer scripts are "wrapped", or expressed, using a
virtual data system language (VDS/VDL). The Freesurfer implementation,
AFAIK, is still in its early stages, but the general workflow model is
pretty well established:
http://www.ci.uchicago.edu/wiki/bin/view/VDS/VDSWeb/WebMain

Best,
Uri

> On 12/6/06, Nick Schmansky <[EMAIL PROTECTED]> wrote:
> > Andries,
> >
> > I am not aware of usage of Freesurfer in a (Sun) Grid Engine environment
> > (such as that used by the Cohen group at UCLA).
> >
> > However, here at the MGH/MIT/HMS Martinos Center we use a cluster of
> > some 100+ nodes configured with Linux Centos 4, and governed by PBS
> > (Portable Batch System).  Researchers here often conduct studies with
> > dozens to hundreds of brains, and for each subject, an instance of
> > Freesurfer's 'recon-all -s <subject> -all' script is submitted to the
> > batch system, which, under the hood, gets submitted to one computing
> > node.  Thus, several dozen brains can be processed in a day (and a
> > half).
> >
> > Freesurfer does not currently support fine-grain parallelism.  Some
> > coarse-grain parallelism, whereby each brain hemisphere is processed
> > independently (benefiting multiprocessor nodes) is possible, but not
> > currently implemented in our 'recon-all' script, as the error handling
> > and logging for doing so is somewhat tricky (and so this feature is
> > in-the-works-but-not-anytime-soon).
> >
> > In short, if you plan on using Freesurfer in studies with large numbers
> > of subjects, I would recommend some kind of computing cluster, and some
> > fairly simple batch software (like PBS) should be sufficient.  For
> > instance, I know of one group that has successfully run Freesurfer on
> > their Altix Itanium Linux cluster.
> >
> > Groetjes,
> >
> > Nick
> >
> >
> > On Wed, 2006-12-06 at 18:44 +0100, Andries van der Leij wrote:
> > >
> > > From: Andries van der Leij
> > > Sent: Wednesday, December 06, 2006 5:59 PM
> > > To: 'freesurfer@nmr.mgh.harvard.edu'
> > > Subject: Freesurfer and Grid computing
> > >
> > >
> > >
> > >
> > > Dear Freesurfer community,
> > >
> > >
> > >
> > > I'm a PhD student at the University of Amsterdam and I'm currently
> > > investigating ways to streamline our MRI data processing
> > > pipeline. Next summer we'll obtain a research-only scanner. I'm trying
> > > to push the group to also invest in computing power and am currently
> > > investigating the applications that researchers will most probably
> > > use.
> > >
> > >
> > >
> > > I came across a project of the Cohen group at UCLA. They have
> > > configured an Apple (Unix) grid and have proposed a more or less
> > > standard setup specially designed for MRI analyses:
> > >
> > > http://airto.bmap.ucla.edu/mt-static/NICluster/archives/2005/06/welcome.html
> > >
> > >
> > >
> > > It is my understanding that one of the members has rewritten the FSL
> > > code to allow distributed parallel processing on a grid. See the
> > > benchmarks here:
> > >
> > > http://airto.bmap.ucla.edu/bmcweb/bmc_bios/MarkCohen/Apple/Benchmarks.htm
> > >
> > >
> > >
> > > My question is fairly simple: are similar steps being taken in the
> > > Freesurfer community? I have no experience with this app myself, but
> > > it is my understanding that Freesurfer consumes a lot of resources.
> > >
> > >
> > >
> > > Thank you very much in advance,
> > >
> > >
> > >
> > > Andries van der Leij
> > >
> > >
> > > _______________________________________________
> > > Freesurfer mailing list
> > > Freesurfer@nmr.mgh.harvard.edu
> > > https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer