Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread Ralph Castain
Guess I don't see why modifying the allocation is required - we have mapping options that should support such things. If you specify the total number of procs you want, and cpus-per-proc=4, it should do the same thing, I would think. You'd get 2 procs on the 8-slot nodes, 8 on the 32-slot nodes,
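
The option Ralph names can be sketched as a command line (the binary name and total rank count here are illustrative, not from the thread):

```shell
# Hypothetical sketch: ask for 16 ranks total with 4 cpus bound to each rank,
# and let the mapper spread them; an 8-slot node then holds 2 ranks and a
# 32-slot node holds 8, with no hand-edited hostfile.
mpirun -np 16 -cpus-per-proc 4 -bind-to core ./a.out
```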

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Our cluster consists of three types of nodes. They have 8, 32 and 64 slots respectively. Since the performance of each core is almost the same, mixed use of these nodes is possible. Furthermore, in this case, for a hybrid application with openmpi+openmp, modification of the hostfile is necessary as fo
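
The hostfile modification being described (the sentence is cut off above) usually means advertising fewer slots per node so each rank keeps cores free for its OpenMP threads; a hypothetical sketch, with invented node names and thread counts:

```shell
# Assumed setup: 4 OpenMP threads per MPI rank, so each node advertises
# slots = cores / OMP_NUM_THREADS. Node names and counts are invented.
cat > hosts_hybrid <<'EOF'
node08 slots=2
node32 slots=8
node64 slots=16
EOF
export OMP_NUM_THREADS=4
mpirun -machinefile hosts_hybrid -np 26 ./a.out
```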

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread Ralph Castain
Why do it the hard way? I'll look at the FAQ because that definitely isn't a recommended thing to do - better to use -host to specify the subset, or just specify the desired mapping using all the various mappers we provide. On Nov 13, 2013, at 6:39 PM, tmish...@jcity.maeda.co.jp wrote: > > >

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Sorry for the cross-post. The nodefile is very simple, consisting of 8 lines: node08 node08 node08 node08 node08 node08 node08 node08 Therefore, NPROCS=8. My aim is to modify the allocation as you pointed out. According to the Open MPI FAQ, a proper subset of the hosts allocated to the Torque / PBS Pro jo

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Sorry, I forgot to tell you. The nodefile is very simple, consisting of 8 lines: node08 node08 node08 node08 node08 node08 node08 node08 tmishima On Nov 13, 2013, at 4:43 PM, tmish...@jcity.maeda.co.jp wrote: > > > > > > > Yes, the node08 has 8 slots but the process I run is also 8. > > > > #

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread Ralph Castain
Please - can you answer my question on script2? What is the value of NPROCS? Why would you want to do it this way? Are you planning to modify the allocation?? That generally is a bad idea, as it can confuse the system. On Nov 13, 2013, at 5:55 PM, tmish...@jcity.maeda.co.jp wrote: > > > Since

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Since what I really want is to run script2 correctly, please let us concentrate on script2. I'm not an expert on the internals of openmpi. What I can do is just observation from the outside. I suspect these lines are strange, especially the last one. [node08.cluster:26952] mca:rmaps:rr: mapping job [565

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread Ralph Castain
On Nov 13, 2013, at 4:43 PM, tmish...@jcity.maeda.co.jp wrote: > > > Yes, the node08 has 8 slots but the process I run is also 8. > > #PBS -l nodes=node08:ppn=8 > > Therefore, I think it should allow this allocation. Is that right? Correct > > My question is why script1 works and script2 d

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Yes, node08 has 8 slots, but the number of processes I run is also 8. #PBS -l nodes=node08:ppn=8 Therefore, I think it should allow this allocation. Is that right? My question is why script1 works and script2 does not. They are almost the same. #PBS -l nodes=node08:ppn=8 export OMP_NUM_THREADS=1 cd $PBS_
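
Both scripts are cut off above; from the fragments quoted in this thread (ppn=8, OMP_NUM_THREADS=1, the ./mPre binary, -report-bindings -bind-to core), a minimal Torque script of this shape can be reconstructed — treat it as a sketch, not the poster's exact file:

```shell
#!/bin/bash
#PBS -l nodes=node08:ppn=8
export OMP_NUM_THREADS=1
cd $PBS_O_WORKDIR
# NPROCS derived from the Torque nodefile (8 lines of "node08" here)
NPROCS=$(wc -l < $PBS_NODEFILE)
mpirun -machinefile $PBS_NODEFILE -np $NPROCS -report-bindings -bind-to core ./mPre
```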

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread Ralph Castain
I guess here's my confusion. If you are using only one node, and that node has 8 allocated slots, then we will not allow you to run more than 8 processes on that node unless you specifically provide the --oversubscribe flag. This is because you are operating in a managed environment (in this cas
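
In command form, the rule Ralph states looks like this (rank count is illustrative; the flag is the one he names):

```shell
# Node allocated with 8 slots: asking for more ranks is refused ...
mpirun -np 16 ./a.out                    # error: not enough slots
# ... unless oversubscription is explicitly requested:
mpirun -np 16 --oversubscribe ./a.out
```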

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread Ralph Castain
It has nothing to do with LAMA as you aren't using that mapper. How many nodes are in this allocation? On Nov 13, 2013, at 4:06 PM, tmish...@jcity.maeda.co.jp wrote: > > > Hi Ralph, this is an additional information. > > Here is the main part of output by adding "-mca rmaps_base_verbose 50".

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Hi Ralph, this is an additional information. Here is the main part of output by adding "-mca rmaps_base_verbose 50". [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm creating map [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-11-13 Thread George Bosilca
On Nov 13, 2013, at 22:41 , George Bosilca wrote: > > On Nov 13, 2013, at 21:05 , Jeff Squyres (jsquyres) > wrote: > >> On Nov 12, 2013, at 4:25 PM, George Bosilca wrote: >> However, the key here is that MPI_STATUS_SIZE is set to be the size of a ***C*** MPI_Status (but expresse

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Hi Ralph, This is the result of adding -mca ras_base_verbose 50. SCRIPT: mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings -bind-to core \ -mca ras_base_verbose 50 -mca plm_base_verbose 5 ./mPre OUTPUT: [node08.cluster:26770] mca:base:select:( plm) Querying component [rsh] [

[OMPI users] Open MPI @SC next week

2013-11-13 Thread Jeff Squyres (jsquyres)
I'm sure everyone reading this email will be in Denver at SC'13 next week (http://sc13.supercomputing.org/). Right? Of course! Many of us from the Open MPI community will be there, and we'd love to chat with real, honest-to-goodness users, admins, and developers who are using Open MPI. Come

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-11-13 Thread George Bosilca
On Nov 13, 2013, at 21:05 , Jeff Squyres (jsquyres) wrote: > On Nov 12, 2013, at 4:25 PM, George Bosilca wrote: > >>> However, the key here is that MPI_STATUS_SIZE is set to be the size of a >>> ***C*** MPI_Status (but expressed in units of Fortran INTEGER size -- so in >>> the sizeof(int)==

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-11-13 Thread Jeff Squyres (jsquyres)
On Nov 12, 2013, at 4:25 PM, George Bosilca wrote: >> However, the key here is that MPI_STATUS_SIZE is set to be the size of a >> ***C*** MPI_Status (but expressed in units of Fortran INTEGER size -- so in >> the sizeof(int)==sizeof(INTEGER)==4 case, MPI_STATUS_SIZE is 6. But in the >> sizeof
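
The arithmetic quoted here can be checked directly: MPI_STATUS_SIZE is the C MPI_Status size expressed in units of Fortran INTEGER, so a 24-byte status (implied by MPI_STATUS_SIZE = 6 with 4-byte INTEGERs) shrinks to 3 units with 8-byte INTEGERs:

```shell
# 24-byte C MPI_Status expressed in INTEGER units (sizes implied by the email):
echo "$((24 / 4)) $((24 / 8))"   # prints: 6 3
```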

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-11-13 Thread Jeff Squyres (jsquyres)
FWIW: George and I (and others) will be at SC next week, and reply latency will be high. The US Thanksgiving holiday is the week after that, so reply latency might still be pretty high/nonexistent that week, too. Just a heads-up... On Nov 13, 2013, at 2:49 PM, Jim Parker wrote: > All, >

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-11-13 Thread Jim Parker
All, I appreciate your help here. I'm traveling all this week and next. I'll forward these comments to some members of my team, but I won't be able to test/look at anything specific to the HPC configuration until I get back. I can say that during my troubleshooting, I did determine that MPI_STA

Re: [OMPI users] Mpirun performance varies changing the hostfile with equivalent configuration.

2013-11-13 Thread Ralph Castain
When you specify slots=16, you are no longer oversubscribed - and so we don't back down the MPI aggressiveness on messaging. When you are oversubscribed, we have each MPI proc release its schedule slice back to the OS when it's waiting for a message. Overloaded and aggressive = bad performance.

[OMPI users] Mpirun performance varies changing the hostfile with equivalent configuration.

2013-11-13 Thread Iván Cores González
Hi, I am running the NAS parallel benchmarks and I have a performance problem depending on the hostfile configuration. I use Open MPI version 1.7.2. I run the FT benchmark in 16 processes, but I want to overload each core with 4 processes (yes, I want to do it), so I execute: time mpirun --hostfi
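
The two configurations being compared (same 16 ranks on the same cores, different slots= values) can be sketched as follows; the node name is invented and the FT binary name follows the NAS convention:

```shell
# Same physical placement, different advertised slot counts.
echo "nodeA slots=4"  > hostfile_oversub   # 16 ranks / 4 slots: oversubscribed
echo "nodeA slots=16" > hostfile_full      # 16 ranks / 16 slots: not oversubscribed
# Oversubscribed run: Open MPI makes waiting ranks yield the CPU.
time mpirun --hostfile hostfile_oversub -np 16 ./ft.C.16
# slots=16 run: aggressive message polling, which hurts when cores are overloaded.
time mpirun --hostfile hostfile_full -np 16 ./ft.C.16
```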

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread Ralph Castain
Hmmm...looks like we aren't getting your allocation. Can you rerun and add -mca ras_base_verbose 50? On Nov 12, 2013, at 11:30 PM, tmish...@jcity.maeda.co.jp wrote: > > > Hi Ralph, > > Here is the output of "-mca plm_base_verbose 5". > > [node08.cluster:23573] mca:base:select:( plm) Queryin

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Hi Ralph, Here is the output of "-mca plm_base_verbose 5". [node08.cluster:23573] mca:base:select:( plm) Querying component [rsh] [node08.cluster:23573] [[INVALID],INVALID] plm:rsh_lookup on agent /usr/bin/rsh path NULL [node08.cluster:23573] mca:base:select:( plm) Query of component [rsh] se

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Hi Ralph, Okay, I can help you. Please give me some time to report the output. Tetsuya Mishima > I can try, but I have no way of testing Torque any more - so all I can do is a code review. If you can build --enable-debug and add -mca plm_base_verbose 5 to your cmd line, I'd appreciate seeing t

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread Ralph Castain
I can try, but I have no way of testing Torque any more - so all I can do is a code review. If you can build --enable-debug and add -mca plm_base_verbose 5 to your cmd line, I'd appreciate seeing the output. On Nov 12, 2013, at 9:58 PM, tmish...@jcity.maeda.co.jp wrote: > > > Hi Ralph, > >
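
Ralph's request amounts to rebuilding with debug support and rerunning with verbose output; roughly as follows (the --with-tm path and install prefix are assumptions, adjust for the local Torque install):

```shell
./configure --prefix=$HOME/ompi-debug --enable-debug --with-tm=/usr/local/torque
make -j4 && make install
mpirun -mca plm_base_verbose 5 -np 8 ./mPre
```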

Re: [OMPI users] Segmentation fault in oob_tcp.c of openmpi-1.7.4a1r29646

2013-11-13 Thread tmishima
Hi Ralph, Thank you for your quick response. I'd like to report one more regression in the Torque support of openmpi-1.7.4a1r29646, which might be related to "#3893: LAMA mapper has problems", which I reported a few days ago. The script below does not work with openmpi-1.7.4a1r29646, although it