[OMPI users] busy waiting and oversubscriptions

2014-03-25 Thread Ross Boylan
Even when "idle", MPI processes use all the CPU. I thought I remember someone saying that they will be low priority, and so not pose much of an obstacle to other uses of the CPU. At any rate, my question is whether, if I have processes that spend most of their time waiting to receive a message, I

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-25 Thread Dave Love
Edgar Gabriel writes:
> yes, the patch has been submitted to the 1.6 branch for review, not sure
> what the precise status of it is. The problems found are more or less
> independent of the PVFS2 version.

Thanks; I should have looked in the tracker.

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-25 Thread Edgar Gabriel
Not sure, honestly. Basically, as suggested earlier in this email chain, I had to disable the PVFS2_IreadContig and PVFS2_IwriteContig routines in ad_pvfs2.c to make the tests pass. Otherwise the tests ran but produced wrong data. However, I did not have the time to figure out what actually goes wrong …

Re: [OMPI users] Help building/installing a working Open MPI 1.7.4 on OS X 10.9.2 with Free PGI Fortran

2014-03-25 Thread Jeff Squyres (jsquyres)
Got your output -- thanks. I'm pretty sure this is pointing to a Libtool bug. Here's the interesting part -- it looks like Libtool simply isn't issuing the command to create the library (!). Check out this (annotated) output from "make V=1" on a Linux/gfortran box:

Making all in src
make …
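[For anyone reproducing this, a sketch of how to capture the verbose build output and look for the link step; the log file name is just an example:

    make V=1 2>&1 | tee build.log
    grep -n -- '--mode=link' build.log

If no libtool --mode=link line appears for the library in question, that matches the symptom described above.]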

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-25 Thread Rob Latham
On 03/25/2014 07:32 AM, Dave Love wrote:
> Edgar Gabriel writes:
>> I am still looking into the PVFS2 with ROMIO problem with the 1.6
>> series, where (as I mentioned yesterday) the problem I am having right
>> now is that the data is wrong. Not sure what causes it, but since I have
>> to teach this afternoon …

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-25 Thread Jeff Squyres (jsquyres)
Sorry -- we've been focusing on 1.7.5 and the impending 1.8 release; I probably won't be able to look at the v1.6 version in the next 2 weeks or so.

On Mar 25, 2014, at 9:09 AM, Edgar Gabriel wrote:
> yes, the patch has been submitted to the 1.6 branch for review, not sure
> what the precise status of it is. …

Re: [OMPI users] coll_ml_priority in openmpi-1.7.5

2014-03-25 Thread Jeff Squyres (jsquyres)
Yes, Nathan has a few coll ml fixes queued up for 1.8.

On Mar 24, 2014, at 10:11 PM, tmish...@jcity.maeda.co.jp wrote:
> I ran our application using the final version of openmpi-1.7.5 again
> with coll_ml_priority = 90.
>
> Then, coll/ml was actually activated and I got these error messages …
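[For reference, a sketch of how such a run is launched and how to inspect the component's priority; parameter names as in Open MPI 1.7.x, and ./a.out is a placeholder:

    ompi_info --level 9 --param coll ml | grep priority
    mpirun -np 4 --mca coll_ml_priority 90 ./a.out

As the report above implies, coll/ml is not selected in 1.7.5 unless its priority is raised this way.]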

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-25 Thread Edgar Gabriel
yes, the patch has been submitted to the 1.6 branch for review, not sure what the precise status of it is. The problems found are more or less independent of the PVFS2 version.

Thanks
Edgar

On 3/25/2014 7:32 AM, Dave Love wrote:
> Edgar Gabriel writes:
>
>> I am still looking into the PVFS2 with …

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-25 Thread Dave Love
Edgar Gabriel writes:
> I am still looking into the PVFS2 with ROMIO problem with the 1.6
> series, where (as I mentioned yesterday) the problem I am having right
> now is that the data is wrong. Not sure what causes it, but since I have
> to teach this afternoon again, it might be Friday until I can …

Re: [OMPI users] problem for multiple clusters using mpirun

2014-03-25 Thread Jeff Squyres (jsquyres)
This is very odd -- the default value for btl_tcp_port_min_v4 is 1024, so unless you have overridden this value, you should not be getting a port less than 1024. You can run this to see:

ompi_info --level 9 --param btl tcp --parsable | grep port_min_v4

Mine says this in a default 1.7.5 installation …
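[A related sketch for pinning the TCP BTL to a specific unprivileged port window; the port values are arbitrary examples, and btl_tcp_port_range_v4 is the companion parameter to the minimum:

    mpirun -np 2 --mca btl_tcp_port_min_v4 2000 --mca btl_tcp_port_range_v4 100 ./a.out

This confines the TCP BTL to ports 2000-2099, which helps when a firewall only opens a known range.]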

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-25 Thread Hamid Saeed
Hello,

Thanks, I figured out the exact problem in my case. Now I am using the following execution line; it directs the MPI communication ports to start from 1:

mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca btl_tcp_if_include br0 --mca btl_tcp_port_min_v4 1 ./a.out

and every…
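[One caution about this workaround: on Unix-like systems, ports below 1024 are reserved for privileged processes, so forcing the minimum to 1 can fail for non-root runs. A safer variant of the same command keeps the range unprivileged; the value 1024 is simply the default minimum:

    mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca btl_tcp_if_include br0 --mca btl_tcp_port_min_v4 1024 ./a.out]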

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-25 Thread Hamid Saeed
Hello,

I am not sure what approach the MPI communication follows, but when I use --mca btl_base_verbose 30 I observe the mentioned port:

[karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 on port 4
[karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_con…

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-25 Thread Reuti
Hi,

On 25.03.2014 at 08:34, Hamid Saeed wrote:
> Is it possible to change the port number for the MPI communication?
>
> I can see that my program uses port 4 for the MPI communication.
>
> [karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 on port 4
> [karp][[4612,1],0 …
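[One plausible reading of the "port 4" in that log, assuming the message prints the raw 16-bit value in network byte order instead of converting it back with ntohs(): the default port 1024 is 0x0400, and the same value byte-swapped is 0x0004, i.e. 4. A quick check of the arithmetic:

    echo $(( (1024 >> 8) | ((1024 & 0xff) << 8) ))   # prints 4

If that reading is right, the port selection is the default after all, and only the log formatting is misleading.]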

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-25 Thread Hamid Saeed
Hello,

Is it possible to change the port number for the MPI communication? I can see that my program uses port 4 for the MPI communication.

[karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 on port 4
[karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_co…