Hi Gus, 

Thanks for your ideas. I have a few questions, and will try to answer yours in 
hopes of solving this!

Should I worry about setting things like --num-cores and --bind-to-cores? This, I 
think, gets at your questions about processor affinity. Am I right? I could 
not quite figure out the -mca mpi_paffinity_alone stuff.
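
In the meantime, here is a rough way to see which cores a running process is
allowed to use, independent of whatever options mpirun was given. This is only a
sketch: it assumes Linux and a Python new enough to have os.sched_getaffinity
(3.3 or later), and the function names are my own, not anything from Open MPI.

```python
import os
import sys


def allowed_cores(pid=0):
    """Return the sorted core IDs a process may run on, or None if the
    platform does not provide os.sched_getaffinity. pid=0 means this process."""
    if not hasattr(os, "sched_getaffinity"):
        return None
    return sorted(os.sched_getaffinity(pid))


def is_pinned(pid=0):
    """True only if the process is restricted to exactly one core."""
    cores = allowed_cores(pid)
    return cores is not None and len(cores) == 1


if __name__ == "__main__":
    # Optionally pass the PID of a running MPI process, as shown by top.
    arg = sys.argv[1] if len(sys.argv) > 1 else ""
    pid = int(arg) if arg.isdigit() else 0
    print("allowed cores:", allowed_cores(pid))
    print("pinned to one core:", is_pinned(pid))
```

If a process launched by mpirun lists every core here, it is presumably not
bound, which would match the shuffling between processors I see in top.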

1. Additional load: no, nothing else. Most of the time not even Firefox. 
2. RAM: no problems apparent when monitoring through top. Interestingly, I did 
wonder about oversubscription, so I tried the option --nooversubscription, but 
this gave me an error message.
3. I have not tried other MPI flavors. I've been speaking to the authors of the 
programs, and they are both using Open MPI.  
4. I don't think that this is a problem, as I'm specifying 
--with-mpi=/usr/bin/...  when I compile the programs. Is there any other way to 
be sure that this is not a problem?
5. I had not been, and you could see some shuffling when monitoring the load on 
specific processors. I have tried to use --bind-to-cores to deal with this. I 
don't understand how to use the -mca options you asked about. 
6. I am using Ubuntu 9.10, with gcc 4.4.1 and g++ 4.4.1.
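
On point 4, one rough way I could double-check is to see whether the mpicc and
mpirun found on my PATH resolve to the same directory. The sketch below is just
that, a heuristic: the function names and the optional path argument are made up
for illustration, and two different MPI flavors installed into the same
directory would still pass.

```python
import os
import shutil


def resolve_tool(name, path=None):
    """Return the symlink-resolved location of `name` on the search path,
    or None if it is not found. path=None means use the PATH variable."""
    found = shutil.which(name, path=path)
    return os.path.realpath(found) if found else None


def same_install(tools=("mpicc", "mpirun"), path=None):
    """True if every tool is found and all of them live in one directory."""
    resolved = [resolve_tool(t, path) for t in tools]
    if any(p is None for p in resolved):
        return False
    return len({os.path.dirname(p) for p in resolved}) == 1


if __name__ == "__main__":
    for tool in ("mpicc", "mpirun"):
        print(tool, "->", resolve_tool(tool))
    print("same directory:", same_install())
```

If the two paths come out in different directories, that would suggest the
compiler wrapper and the launcher belong to different MPI installations.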


MrBayes is a program for Bayesian phylogenetics:  
http://mrbayes.csit.fsu.edu/wiki/index.php/Main_Page 
ABySS is a program for assembly of DNA sequence data: 
http://www.bcgsc.ca/platform/bioinfo/software/abyss

> Do the programs mix MPI (message passing) with OpenMP (threads)? 
> 
I'm honestly not sure what this means.

Thanks for all your help!

Matt

>  Hi Matthew 
> More guesses/questions than anything else: 
> 1) Is there any additional load on this machine? 
> We had problems like that (on different machines) when 
> users start listening to streaming video, doing Matlab calculations, 
> etc, while the MPI programs are running. 
> This tends to oversubscribe the cores, and may lead to crashes. 
> 2) RAM: 
> Can you monitor the RAM usage through "top"? 
> (I presume you are on Linux.) 
> It may show unexpected memory leaks, if they exist. 
> In "top", type "1" (one) to see all cores, type "f" then "j" 
> to see the core number associated with each process. 
> 3) Do the programs work right with other MPI flavors (e.g. MPICH2)? 
> If not, then it is not OpenMPI's fault. 
> 4) Any possibility that the MPI versions/flavors of mpicc and 
> mpirun that you are using to compile and launch the program are not the 
> same? 
> 5) Are you setting processor affinity on mpiexec? 
> mpiexec -mca mpi_paffinity_alone 1 -np ... bla, bla ... 
> Context switching across the cores may also cause trouble, I suppose. 
> 6) Which Linux are you using (uname -a)? 
> On other mailing lists I read reports that only quite recent kernels 
> support all the Intel Nehalem processor features well. 
> I don't have Nehalem, I can't help here, 
> but the information may be useful 
> for other list subscribers to help you. 
> *** 
> As for the programs, some programs require specific setup 
> (and even specific compilation) when the number of MPI processes 
> varies. 
> It may help if you send us links to the program sites. 
> Bayesian statistics is not totally out of our business, 
> but phylogenetic trees are not really my league, 
> hence forgive me any bad guesses, please, 
> but would it need specific compilation or a different 
> set of input parameters to run correctly on a different 
> number of processors? 
> Do the programs mix MPI (message passing) with OpenMP (threads)? 
> I found this MrBayes, which seems to do the above: 
> http://mrbayes.csit.fsu.edu/ 
> http://mrbayes.csit.fsu.edu/wiki/index.php/Main_Page 
> As for the ABySS, what is it, where can it be found? 
> Doesn't look like a deep ocean circulation model, as the name suggests. 
> My $0.02 
> Gus Correa 
