Re: [OMPI users] 1.6.2 affinity failures

2012-12-20 Thread Brock Palen
w00t :-) Thanks Brock Palen www.umich.edu/~brockp CAEN Advanced Computing bro...@umich.edu (734)936-1985 On Dec 20, 2012, at 10:46 AM, Ralph Castain wrote: > Hmmm... I'll see what I can do about the error message. I don't think there > is much in 1.6 I can do, but in 1.7 I could generate an

Re: [OMPI users] 1.6.2 affinity failures

2012-12-20 Thread Ralph Castain
Hmmm... I'll see what I can do about the error message. I don't think there is much in 1.6 I can do, but in 1.7 I could generate an appropriate error message as we have a way to check the topologies. On Dec 20, 2012, at 7:11 AM, Brock Palen wrote: > Ralph, > > Thanks for the info, > That sai

Re: [OMPI users] 1.6.2 affinity failures

2012-12-20 Thread Brock Palen
Ralph, Thanks for the info. That said, I found the problem: one of the new nodes had Hyperthreading on and the rest didn't, so the nodes didn't all match. A quick pdsh lstopo | dshbak -c uncovered the one different node. The error just didn't give me a clue to that being the cause, which
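For anyone hitting the same thing: the comparison that dshbak -c performs (collapsing hosts whose output is identical) can be sketched in a few lines of Python. This is only an illustration, assuming each node's topology text has already been collected into a dict. The function names group_by_output and find_outliers, the node names, and the topology strings are all hypothetical placeholders, not real lstopo output.

```python
from collections import defaultdict


def group_by_output(outputs):
    """Group host names by identical (whitespace-normalized) output text.

    outputs: dict mapping host name -> captured text for that host.
    Returns a list of (sorted_hosts, text) tuples, largest group first.
    """
    groups = defaultdict(list)
    for host, text in outputs.items():
        # Strip trailing whitespace so cosmetic differences do not split groups.
        key = "\n".join(line.rstrip() for line in text.strip().splitlines())
        groups[key].append(host)
    grouped = [(sorted(hosts), text) for text, hosts in groups.items()]
    # Largest group first; ties broken by host names for a stable order.
    grouped.sort(key=lambda g: (-len(g[0]), g[0]))
    return grouped


def find_outliers(outputs):
    """Return hosts whose output differs from the most common output."""
    grouped = group_by_output(outputs)
    return [host for hosts, _ in grouped[1:] for host in hosts]


# Hypothetical, simplified stand-ins for per-node topology summaries.
# node03 reports twice as many processing units, as a node with
# hyperthreading enabled would.
sample = {
    "node01": "sockets=2 cores=12 pus=12",
    "node02": "sockets=2 cores=12 pus=12",
    "node03": "sockets=2 cores=12 pus=24",
    "node04": "sockets=2 cores=12 pus=12  ",
}

odd_nodes = find_outliers(sample)
```

Treating the largest group as the reference is a judgment call; on a cluster that is split evenly between two configurations, you would want to look at every group rather than only the outliers.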

Re: [OMPI users] 1.6.2 affinity failures

2012-12-19 Thread Ralph Castain
I'm afraid these are both known problems in the 1.6.2 release. I believe we fixed npersocket in 1.6.3, though you might check to be sure. On the large-scale issue, cpus-per-rank might well fail under those conditions. The algorithm in the 1.6 series hasn't seen much use, especially at scale. In

[OMPI users] 1.6.2 affinity failures

2012-12-19 Thread Brock Palen
Using Open MPI 1.6.2 with Intel 13.0, though the problem is not specific to the compiler. Using two 12-core, 2-socket nodes: mpirun -np 4 -npersocket 2 uptime -- Your job has requested a conflicting number of processes for the a