Hi Gus,

Interestingly, the connectivity_c test works fine with -np 
<8. For -np >8 it works some of the time; other times it HANGS. I have to 
believe that this is a big clue! Also, when it hangs, I sometimes get the 
message "mpirun was unable to cleanly terminate the daemons on the nodes shown 
below", yet NO nodes are shown below. Once, I got -np 250 to pass the 
connectivity test, but I was not able to replicate this reliably, so I'm not 
sure if it was a fluke or what. Here is a link to a screenshot of top while 
connectivity_c is hung with -np 14; I see that 2 processes are only at 50% CPU 
usage. Hmmmm

http://picasaweb.google.com/lh/photo/87zVEucBNFaQ0TieNVZtdw?authkey=Gv1sRgCLKokNOVqo7BYw&feat=directlink

The other tests, ring_c and hello_c, as well as the C++ versions of these, 
work with all values of -np.

Using -mca mpi_paffinity_alone 1 I get the same behavior. 

I agree that I should worry about the mismatch between where the libraries 
are installed and where I am telling my programs to look for them. Would 
this type of mismatch cause behavior like what I am seeing, i.e. working with 
a small number of processes but failing with larger numbers? It seems like a 
mismatch would have the same effect regardless of the number of processes 
used, but maybe I am mistaken. Anyway, to address this: "which mpirun" gives me 
/usr/local/bin/mpirun, so to configure I use ./configure 
--with-mpi=/usr/local/bin/mpirun, and to run, /usr/local/bin/mpirun -np X ... 
This should rule out a mismatch, right?

uname -a gives me: Linux macmanes 2.6.31-16-generic #52-Ubuntu SMP Thu Dec 3 
22:07:16 UTC 2009 x86_64 GNU/Linux

Matt

On Dec 8, 2009, at 8:50 PM, Gus Correa wrote:

> Hi Matthew
> 
> Please see comments/answers inline below.
> 
> Matthew MacManes wrote:
>> Hi Gus, Thanks for your ideas. I have a few questions, and will try to 
>> answer yours in hopes of solving this!
> 
> A simple way to test OpenMPI on your system is to run the
> test programs that come with the OpenMPI source code,
> hello_c.c, connectivity_c.c, and ring_c.c:
> http://www.open-mpi.org/
> 
> Get the tarball from the OpenMPI site, gunzip and untar it,
> and look for the test programs in the "examples" directory.
> Compile it with /your/path/to/openmpi/bin/mpicc hello_c.c
> Run it with /your/path/to/openmpi/bin/mpiexec -np X a.out
> using X = 2, 4, 8, 16, 32, 64, ...
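> 
> For example, a minimal sketch assuming the 1.4 tarball (adjust the
> version and the install path to whatever you actually have):
> 
>   tar -xzf openmpi-1.4.tar.gz
>   cd openmpi-1.4/examples
>   /your/path/to/openmpi/bin/mpicc hello_c.c -o hello_c
>   /your/path/to/openmpi/bin/mpiexec -np 8 ./hello_c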
> 
> This will tell you whether your OpenMPI is functional,
> and if you can run on many Nehalem cores,
> even with oversubscription perhaps.
> It will also set the stage for further investigation of your
> actual programs.
> 
> 
>> Should I worry about setting things like --num-cores or --bind-to-cores? This, 
>> I think, gets at your questions about processor affinity. Am I right? I 
>> could not exactly figure out the -mca mpi_paffinity_alone stuff...
> 
> I use the simple-minded -mca mpi_paffinity_alone 1.
> This is probably the easiest way to assign a process to a core.
> There are more complex ways in OpenMPI, but I haven't tried them.
> Indeed, -mca mpi_paffinity_alone 1 does improve the performance of
> our programs here.
> There is a chance that without it the 16 virtual cores of
> your Nehalem get confused when there are more than 3 processes
> (you reported that -np > 3 breaks).
> 
> Did you try adding just -mca mpi_paffinity_alone 1 to
> your mpiexec command line?
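> 
> Something like this (the -np value and program are just an example):
> 
>   /your/path/to/openmpi/bin/mpiexec -mca mpi_paffinity_alone 1 -np 8 ./a.out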
> 
> 
>> 1. Additional load: nope. nothing else, most of the time not even firefox. 
> 
> Good.
> Turn off firefox, etc, to make it even better.
> Ideally, use runlevel 3, no X, like a computer cluster node,
> but this may not be required.
> 
>> 2. RAM: no problems apparent when monitoring through top. Interestingly, I did 
>> wonder about oversubscription, so I tried the option --nooversubscription, 
>> but this gave me an error message.
> 
> Oversubscription from your program would only happen if
> you asked for more processes than available cores, i.e.,
> -np > 8 (or "virtual" cores, in case of Nehalem hyperthreading,
> -np > 16).
> Since you have -np=4 there is no oversubscription,
> unless you have other external load (e.g. Matlab, etc),
> but you said you don't.
> 
> Yet another possibility would be if your program is threaded
> (e.g. using OpenMP along with MPI), but considering what you
> said about OpenMP I would guess the programs don't use it.
> For instance, you launch the program with 4 MPI processes,
> and each process decides to start, say, 8 OpenMP threads.
> You end up with 32 threads and 8 (real) cores (or 16 hyperthreaded
> ones on Nehalem).
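> 
> If you want to check for extra threads, "top" can show them:
> press "H" (capital) to toggle the per-thread view. Or, a rough
> sketch (the program name is just a placeholder):
> 
>   ps -eLf | grep your_program
> 
> which prints one line per thread (LWP) rather than per process.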
> 
> 
> What else does top say?
> Any hog processes (memory- or CPU-wise)
> besides your program processes?
> 
>> 3. I have not tried other MPI flavors. I've been speaking to the authors of 
>> the programs, and they are both using OpenMPI.  
> 
> I was not trying to convince you to use another MPI.
> I use MPICH2 also, but OpenMPI reigns here.
> The idea of trying it with MPICH2 was just to check whether OpenMPI
> is causing the problem, but I don't think it is.
> 
>> 4. I don't think that this is a problem, as I'm specifying 
>> --with-mpi=/usr/bin/...  when I compile the programs. Is there any other way 
>> to be sure that this is not a problem?
> 
> Hmmm ....
> I don't know about your Ubuntu (we have CentOS and Fedora on various
> machines).
> However, most Linux distributions come with their MPI flavors,
> and so do compilers, etc.
> Oftentimes they install these goodies in unexpected places,
> and this has caused a lot of frustration.
> There are tons of postings on this list that eventually
> boiled down to mismatched versions of MPI in unexpected places.
> 
> 
> The easy way is to use full path names to compile and to run.
> Something like this in your program configuration script:
> /my/openmpi/bin/mpicc
> 
> and something like this
> /my/openmpi/bin/mpiexec -np  ... bla, bla ...
> when you submit the job.
> 
> You can check your version with "which mpicc", "which mpiexec",
> and (perhaps using full path names) with
> "ompi_info", "mpicc --showme", "mpiexec --help".
> 
> 
>> 5. I had not been, and you could see some shuffling when monitoring the load 
>> on specific processors. I have tried to use --bind-to-cores to deal with 
>> this. I don't understand how to use the -mca options you asked about.
>> 6. I am using Ubuntu 9.10, gcc 4.4.1, and g++ 4.4.1.
> 
> I am afraid I won't be of help, because I don't have Nehalem.
> However, I read about Nehalem requiring quite recent kernels
> to get all of its features working right.
> 
> What is the output of "uname -a"?
> This will tell the kernel version, etc.
> Other list subscribers may give you a suggestion if you post the
> information.
> 
>> MrBayes is a program for Bayesian phylogenetics:  
>> http://mrbayes.csit.fsu.edu/wiki/index.php/Main_Page
>> ABySS is a program for assembly of DNA sequence data: 
>> http://www.bcgsc.ca/platform/bioinfo/software/abyss
> 
> Thanks for the links!
> I had found the MrBayes link.
> I eventually found out what your ABySS was about, but had no links.
> Amazing that it is about DNA/gene sequencing.
> Our abyss here is the deep ocean ... :)
> Abysmal difference!
> 
>>> Do the programs mix MPI (message passing) with OpenMP (threads)? 
>> I'm honestly not sure what this means...
> 
> Some programs mix the two.
> OpenMP only works in a shared memory environment (e.g. a single
> computer like yours), whereas MPI can use both shared memory
> and work across a network (e.g. in a cluster).
> There are other differences too.
> 
> It is unlikely that you have this hybrid type of parallel program;
> otherwise there would be some reference to OpenMP
> in the program configuration files, documentation, etc.
> Also, in general the configuration scripts of these hybrid
> programs can turn on MPI only, or OpenMP only, or both,
> depending on how you configure.
> 
> Even to compile with OpenMP you would need a proper compiler
> flag, but that one might be hidden in a Makefile too, making
> it a bit hard to find. "grep -n mp Makefile" may give a clue.
> Anything on the documentation that mentions threads or OpenMP?
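> 
> For example (assuming the usual gcc flag name):
> 
>   grep -n fopenmp Makefile*
> 
> would catch gcc's -fopenmp, if any of the Makefiles set it.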
> 
> FYI, here is OpenMP:
> http://openmp.org/wp/
> 
>> Thanks for all your help!
>> Matt
> 
> Well, so far it didn't really help. :(
> 
> But let's hope to find a clue,
> maybe with a little help of
> our list subscriber friends.
> 
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
> 
>>> Hi Matthew
>>> 
>>> More guesses/questions than anything else:
>>> 
>>> 1) Is there any additional load on this machine?
>>> We had problems like that (on different machines) when
>>> users start listening to streaming video, doing Matlab calculations,
>>> etc, while the MPI programs are running.
>>> This tends to oversubscribe the cores, and may lead to crashes.
>>> 
>>> 2) RAM:
>>> Can you monitor the RAM usage through "top"?
>>> (I presume you are on Linux.)
>>> It may show unexpected memory leaks, if they exist.
>>> 
>>> On "top", type "1" (one) see all cores, type "f" then "j"
>>> to see the core number associated to each process.
>>> 
>>> 3) Do the programs work right with other MPI flavors (e.g. MPICH2)?
>>> If not, then it is not OpenMPI's fault.
>>> 
>>> 4) Any possibility that the MPI versions/flavors of mpicc and
>>> mpirun that you are using to compile and launch the program are not the
>>> same?
>>> 
>>> 5) Are you setting processor affinity on mpiexec?
>>> 
>>> mpiexec -mca mpi_paffinity_alone 1 -np ... bla, bla ...
>>> 
>>> Context switching across the cores may also cause trouble, I suppose.
>>> 
>>> 6) Which Linux are you using (uname -a)?
>>> 
>>> On other mailing lists I read reports that only quite recent kernels
>>> support all the Intel Nehalem processor features well.
>>> I don't have Nehalem, I can't help here,
>>> but the information may be useful
>>> for other list subscribers to help you.
>>> 
>>> ***
>>> 
>>> As for the programs, some require specific setup
>>> (and even specific compilation) when the number of MPI processes
>>> varies.
>>> It may help if you tell us a link to the program sites.
>>> 
>>> Bayesian statistics is not totally out of our business,
>>> but phylogenetic trees are not really my league,
>>> so please forgive me any bad guesses,
>>> but would it need specific compilation or a different
>>> set of input parameters to run correctly on a different
>>> number of processors?
>>> Do the programs mix MPI (message passing) with OpenMP (threads)?
>>> 
>>> I found this MrBayes, which seems to do the above:
>>> 
>>> http://mrbayes.csit.fsu.edu/
>>> http://mrbayes.csit.fsu.edu/wiki/index.php/Main_Page
>>> 
>>> As for the ABySS, what is it, where can it be found?
>>> It doesn't look like a deep ocean circulation model, as the name suggests.
>>> 
>>> My $0.02
>>> Gus Correa 

_________________________________
Matthew MacManes
PhD Candidate
University of California- Berkeley
Museum of Vertebrate Zoology
Phone: 510-495-5833
Lab Website: http://ib.berkeley.edu/labs/lacey
Personal Website: http://macmanes.com/




