Durga, I guess we have strayed a bit from the original post.

My personal opinion is that a number of codes can run in HPC-like mode over Gigabit Ethernet, not just the trivially parallelizable ones. The hardware components are one key: PCI-X, a NIC with low hardware latency (the Intel PRO/1000 is about 6.6 microseconds versus roughly 14 for the Broadcom 5721), and a non-blocking (that's the key word) switch. Then you need a good driver and a good MPI software layer. At present MPICH is ahead of LAM/OpenMPI/MVAPICH in its implementation of optimized collectives; at least that's how it seems to me (let me say that quickly, before I get flamed). MPICH got a bad rap performance-wise because its TCP driver was mediocre compared with LAM and OpenMPI, but MPICH + GAMMA is very fast.

MPIGAMMA even beats our Infiniband cluster running OpenMPI on MPI_Allreduce. The test was with 64 CPUs: 32 nodes on the GAMMA cluster (dual-core P4s) and 16 nodes on the Infiniband cluster (dual dual-core Opterons). The IB cluster worked out at 24 MBytes/sec (vector size / time) and GigE + MPIGAMMA at 39 MBytes/sec. On the other hand, if I use my own optimized AllReduce (a simplified version of the one in MPICH) on the IB cluster, it gets 108 MBytes/sec. So the tricky thing is that all the components need to be in place to get good application performance.
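In case anyone wants to reproduce that kind of number: the measurement is nothing fancy, just time MPI_Allreduce on a fixed-size vector of doubles, average over a bunch of repetitions, and divide the vector size by the time. A minimal sketch along those lines (an illustration only, not Netbench itself; the vector size and repeat count are arbitrary):

/* allreduce_bw.c - report MPI_Allreduce bandwidth as vector size / time.
 * Compile: mpicc -O2 allreduce_bw.c -o allreduce_bw
 * Run:     mpirun -np 64 ./allreduce_bw
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int n    = 131072;        /* 1 MByte of doubles (arbitrary)  */
    const int reps = 100;           /* repetitions to average over     */
    int rank, i, r;
    double *in, *out, t0, t1, tavg;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    in  = malloc(n * sizeof(double));
    out = malloc(n * sizeof(double));
    for (i = 0; i < n; i++)
        in[i] = (double)i;

    /* one untimed call to warm things up, then time the loop */
    MPI_Allreduce(in, out, n, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (r = 0; r < reps; r++)
        MPI_Allreduce(in, out, n, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    t1 = MPI_Wtime();
    tavg = (t1 - t0) / reps;

    if (rank == 0)
        printf("%d doubles: %.6f s per allreduce, %.1f MBytes/sec\n",
               n, tavg, n * sizeof(double) / tavg / 1.0e6);

    free(in);
    free(out);
    MPI_Finalize();
    return 0;
}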
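And on the hand-rolled AllReduce: as far as I understand it, MPICH does recursive doubling at small message sizes and a reduce-scatter plus allgather at large ones, and that general idea is what I cribbed from. Below is a toy recursive-doubling version for summing doubles, assuming the number of processes is a power of two; it illustrates the algorithm only, it is not the code I actually run.

/* Recursive-doubling allreduce (sum of doubles), in place in buf[].
 * Assumes the communicator size is a power of two.  Every process
 * ends up with the full sum after log2(P) exchange steps.
 */
#include <mpi.h>
#include <stdlib.h>

int allreduce_sum_rd(double *buf, int n, MPI_Comm comm)
{
    int rank, size, mask, partner, i;
    double *tmp;
    MPI_Status status;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    tmp = malloc(n * sizeof(double));

    /* Each step pairs up processes whose ranks differ in one bit;
     * they swap their partial sums and accumulate.
     */
    for (mask = 1; mask < size; mask <<= 1) {
        partner = rank ^ mask;
        MPI_Sendrecv(buf, n, MPI_DOUBLE, partner, 0,
                     tmp, n, MPI_DOUBLE, partner, 0,
                     comm, &status);
        for (i = 0; i < n; i++)
            buf[i] += tmp[i];
    }

    free(tmp);
    return MPI_SUCCESS;
}

For long vectors the reduce-scatter plus allgather variant wins, because each process moves roughly 2*(P-1)/P of the vector in total instead of log2(P) full copies.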
GAMMA is not so easy to set up; I had considerable help from Giuseppe. It has libraries to compile and the kernel needs to be recompiled. Once I got that automated, I could build and install a new version of GAMMA in about 5 minutes. The MPIGAMMA build is just like MPICH, and MPIGAMMA works almost exactly the same way, so any application that compiles under MPICH should compile under MPIGAMMA just by changing the path.

I have run half a dozen apps with GAMMA. Netpipe, Netbench (my network tester, a simplified version of IMB), Susp3D (my own code, a CFD-like application) and DLPOLY all compile out of the box. Gromacs compiles but has a couple of "bugs" that crash it on execution. One is an archaic test for MPICH that prevents a clean exit; it must have been a workaround for an earlier version of MPICH. The other seems to be an fclose of an unassigned file pointer. It works OK in LAM, but my guess is it's illegal strictly speaking (see the P.S. at the end); a student was supposed to check on that. VASP also compiles out of the box if you can compile it with MPICH. But there is a problem with MPIGAMMA and the MPI_Alltoall function right now: it works, but it suffers from hangups and long delays, so GAMMA is not good for VASP at the moment. You see substantial performance improvements sometimes, but other times it's dreadfully slow. I can reproduce the problem with an Alltoall test code (the P.P.S. at the end shows the shape of it), and Giuseppe is going to try to debug it.

So GAMMA is not a panacea. In most circumstances it is stable and predictable, and much more reproducible than MPI over TCP, but there may still be one or two bugs, and there are several issues:

1) Since GAMMA is tightly entwined with the kernel, a crash frequently brings the whole system down, which is a bit annoying; it can also crash other nodes in the same GAMMA virtual machine.

2) NICs are very buggy hardware; if you look at a TCP driver you will find a large number of hardware bug workarounds in it. A number of GAMMA problems can be traced to this, and it's a lot of work to reimplement all those workarounds.

3) GAMMA nodes have to be preconfigured at boot. You can run more than one job on a GAMMA virtual machine, but it's a little iffy; there can be interactions between nodes on the same VM even if they are running different jobs. Different GAMMA VMs need different VLANs, so a multiuser environment is still problematic.

4) Giuseppe said MPIGAMMA was a very difficult code to write, so I would guess a port to OpenMPI would not be trivial. Also, I would want to see optimized collectives in OpenMPI before I switched from MPICH.

As far as I know GAMMA is the most advanced non-TCP protocol. At its core it works really well, but it still needs a lot more testing and development. Giuseppe is great to work with if anyone out there is interested. Go to the MPIGAMMA website for more info: http://www.disi.unige.it/project/gamma/mpigamma/index.html

Tony
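P.S. On the Gromacs fclose issue: passing an uninitialized (or NULL) FILE pointer to fclose is undefined behaviour in C, so the fact that it happens to survive under LAM is luck rather than correctness. The pattern looks roughly like this (a contrived illustration, not the actual Gromacs source; the guard in the second function is the obvious fix):

#include <stdio.h>

/* The buggy shape: the pointer is only assigned on some code paths,
 * but fclose() is called unconditionally.
 */
void bad_cleanup(int logging)
{
    FILE *log;                      /* never assigned if logging == 0 */
    if (logging)
        log = fopen("run.log", "w");
    /* ... */
    fclose(log);                    /* undefined behaviour when logging == 0 */
}

/* The safe shape: initialise to NULL and guard the close. */
void safe_cleanup(int logging)
{
    FILE *log = NULL;
    if (logging)
        log = fopen("run.log", "w");
    /* ... */
    if (log != NULL) {
        fclose(log);
        log = NULL;
    }
}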
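P.P.S. The Alltoall test I mentioned is nothing exotic either; something of roughly this shape is enough to show the hangups and long delays under MPIGAMMA (again an illustration, not my actual tester; the message sizes and repeat count are arbitrary):

/* alltoall_test.c - time MPI_Alltoall over a range of message sizes. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int reps = 20;            /* repetitions per message size */
    int rank, size, n, i, r;
    double *sendbuf, *recvbuf, t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* n doubles per destination process, doubling from 1 up to 64k */
    for (n = 1; n <= 65536; n *= 2) {
        sendbuf = malloc((size_t)n * size * sizeof(double));
        recvbuf = malloc((size_t)n * size * sizeof(double));
        for (i = 0; i < n * size; i++)
            sendbuf[i] = (double)rank;

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (r = 0; r < reps; r++)
            MPI_Alltoall(sendbuf, n, MPI_DOUBLE,
                         recvbuf, n, MPI_DOUBLE, MPI_COMM_WORLD);
        t1 = MPI_Wtime();

        if (rank == 0)
            printf("%7d doubles per pair: %.6f s per alltoall\n",
                   n, (t1 - t0) / reps);

        free(sendbuf);
        free(recvbuf);
    }

    MPI_Finalize();
    return 0;
}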