Thanks! I ran the command:

mpirun --slot-list 0-3 -np 4 --report-bindings $EXECUTABLE

and this is the output on standard error:

[node50.cl.corp.com:15473] [[45030,0],0] odls:default:fork binding child [[45030,1],0] to slot_list 0-3
[node50.cl.corp.com:15473] [[45030,0],0] odls:default:fork binding child [[45030,1],1] to slot_list 0-3
[node50.cl.corp.com:15473] [[45030,0],0] odls:default:fork binding child [[45030,1],2] to slot_list 0-3
[node50.cl.corp.com:15473] [[45030,0],0] odls:default:fork binding child [[45030,1],3] to slot_list 0-3

top shows the first 4 cores (Cpu0-Cpu3) are bound and fully busy:

top - 11:17:06 up 35 days,  1:03,  2 users,  load average: 3.15, 1.15, 0.41
Tasks: 453 total,   6 running, 446 sleeping,   1 stopped,   0 zombie
Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8059116k total,  1577220k used,  6481896k free,    62020k buffers
Swap: 16787916k total,    61108k used, 16726808k free,   718036k cached


For a multi-node job, a rankfile is needed (a sketch follows the FAQ link below):

http://www.open-mpi.org/faq/?category=tuning#using-paffinity-v1.3
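
For reference, a rankfile pins each rank to an explicit host and slot. A minimal sketch for 4 ranks across two nodes (the hostnames here are placeholders, not our actual node list):

  rank 0=node50 slot=0
  rank 1=node50 slot=1
  rank 2=node51 slot=0
  rank 3=node51 slot=1

run with something like:

  mpirun -np 4 --rankfile myrankfile $EXECUTABLE

Under a batch system the rankfile can be generated from the node list; an untested sketch assuming PBS (one hostname per allocated slot in $PBS_NODEFILE, 4 slots per node):

  i=0
  while read node; do
    echo "rank $i=$node slot=$((i % 4))"
    i=$((i + 1))
  done < "$PBS_NODEFILE" > myrankfile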

Appreciate the suggestions and solution.



On Jul 16, 2012, at 5:08 PM, Ralph Castain wrote:

> Or you could just do:
> 
> mpirun --slot-list 0-3 -np 4 hostname
> 
> That will put the four procs on cpu numbers 0-3, which should all be on
> the first socket.
> 
> 
> On Jul 16, 2012, at 3:23 PM, Dominik Goeddeke wrote:
> 
>> In the "old" 1.4.x and 1.5.x, I achieved this by using rankfiles (see the
>> FAQ), and it worked very well. With those versions, --byslot etc. didn't
>> work for me; I always needed the rankfiles. I haven't yet tried the
>> overhauled "convenience wrappers" in 1.6 that you are using for this
>> feature, but I see no reason why the "old" way should not work, although it
>> requires some shell magic if rankfiles are to be generated automatically
>> from e.g. PBS or SLURM node lists.
>> 
>> Dominik
>> 
>> On 07/17/2012 12:13 AM, Anne M. Hammond wrote:
>>> There are 2 physical processors, each with 4 cores (no hyperthreading).
>>> 
>>> I want to instruct Open MPI to run only on the first physical processor,
>>> using all 4 of its cores.
>>> 
>>> 
>>> [hammond@node48 ~]$ cat /proc/cpuinfo
>>> processor : 0
>>> vendor_id : AuthenticAMD
>>> cpu family : 16
>>> model : 4
>>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>>> stepping : 2
>>> cpu MHz : 2311.694
>>> cache size : 512 KB
>>> physical id : 0
>>> siblings : 4
>>> core id : 0
>>> cpu cores : 4
>>> apicid : 0
>>> initial apicid : 0
>>> fpu : yes
>>> fpu_exception : yes
>>> cpuid level : 5
>>> wp : yes
>>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 
>>> clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 
>>> 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor 
>>> cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 
>>> 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>>> bogomips : 4623.38
>>> TLB size : 1024 4K pages
>>> clflush size : 64
>>> cache_alignment : 64
>>> address sizes : 48 bits physical, 48 bits virtual
>>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>> 
>>> (The remaining entries, processors 1-7, are identical to processor 0 except
>>> for the fields below and negligible bogomips differences:)
>>> 
>>> processor   physical id   core id   apicid / initial apicid
>>>     1            0           1               1
>>>     2            0           2               2
>>>     3            0           3               3
>>>     4            1           0               4
>>>     5            1           1               5
>>>     6            1           2               6
>>>     7            1           3               7
>>> 
>>> 
>>> On Jul 16, 2012, at 4:09 PM, Elken, Tom wrote:
>>> 
>>>> Anne,
>>>> 
>>>> Output from "cat /proc/cpuinfo" on your node "hostname" may help those
>>>> trying to answer.
>>>> 
>>>> -Tom
>>>> 
>>>>> -----Original Message-----
>>>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>>>>> Behalf Of Ralph Castain
>>>>> Sent: Monday, July 16, 2012 2:47 PM
>>>>> To: Open MPI Users
>>>>> Subject: Re: [OMPI users] openmpi tar.gz for 1.6.1 or 1.6.2
>>>>> 
>>>>> I gather there are two sockets on this node? So the second cmd line is
>>>>> equivalent to leaving "num-sockets" off the cmd line?
>>>>> 
>>>>> I haven't tried what you are doing, so it is quite possible this is a bug.
>>>>> 
>>>>> 
>>>>> On Jul 16, 2012, at 1:49 PM, Anne M. Hammond wrote:
>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> Built the latest snapshot. Still getting an error when trying to run
>>>>>> on only one socket (see below). Is there a workaround?
>>>>>> 
>>>>>> [hammond@node65 bin]$ ./mpirun -np 4 --num-sockets 1 --npersocket 4 hostname
>>>>>> --------------------------------------------------------------------------
>>>>>> An invalid physical processor ID was returned when attempting to
>>>>>> bind an MPI process to a unique processor.
>>>>>> 
>>>>>> This usually means that you requested binding to more processors than
>>>>>> exist (e.g., trying to bind N MPI processes to M processors, where N >
>>>>>> M).  Double check that you have enough unique processors for all the
>>>>>> MPI processes that you are launching on this host.
>>>>>> 
>>>>>> Your job will now abort.
>>>>>> --------------------------------------------------------------------------
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun was unable to start the specified application as it
>>>>>> encountered an error:
>>>>>> 
>>>>>> Error name: Fatal
>>>>>> Node: node65.cl.corp.com
>>>>>> 
>>>>>> when attempting to start process rank 0.
>>>>>> --------------------------------------------------------------------------
>>>>>> 4 total processes failed to start
>>>>>> 
>>>>>> 
>>>>>> [hammond@node65 bin]$ ./mpirun -np 4 --num-sockets 2 --npersocket 4 hostname
>>>>>> node65.cl.corp.com
>>>>>> node65.cl.corp.com
>>>>>> node65.cl.corp.com
>>>>>> node65.cl.corp.com
>>>>>> [hammond@node65 bin]$
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Jul 16, 2012, at 12:56 PM, Ralph Castain wrote:
>>>>>> 
>>>>>>> Jeff is at the MPI Forum this week, so his answers will be delayed.
>>>>>>> Last I heard, it was close, but no specific date has been set.
>>>>>>> 
>>>>>>> 
>>>>>>> On Jul 16, 2012, at 11:49 AM, Michael E. Thomadakis wrote:
>>>>>>> 
>>>>>>>> When is the expected date for the official 1.6.1 (or 1.6.2?) to be
>>>>>>>> available?
>>>>>>>> 
>>>>>>>> mike
>>>>>>>> 
>>>>>>>> On 07/16/2012 01:44 PM, Ralph Castain wrote:
>>>>>>>>> You can get it here:
>>>>>>>>> 
>>>>>>>>> http://www.open-mpi.org/nightly/v1.6/
>>>>>>>>> 
>>>>>>>>> On Jul 16, 2012, at 10:22 AM, Anne M. Hammond wrote:
>>>>>>>>> 
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> For benchmarking, we would like to use openmpi with
>>>>>>>>>> --num-sockets 1
>>>>>>>>>> 
>>>>>>>>>> This fails in 1.6, but Bug Report #3119 indicates it is changed in
>>>>>>>>>> 1.6.1.
>>>>>>>>>> 
>>>>>>>>>> Is 1.6.1 or 1.6.2 available in tar.gz form?
>>>>>>>>>> 
>>>>>>>>>> Thanks!
>>>>>>>>>> Anne
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>> 
>>> 
>> 
>> 
>> -- 
>> Jun.-Prof. Dr. Dominik Göddeke
>> Hardware-orientierte Numerik für große Systeme
>> Institut für Angewandte Mathematik (LS III)
>> Fakultät für Mathematik, Technische Universität Dortmund
>> 
>> http://www.mathematik.tu-dortmund.de/~goeddeke
>> 
>> Tel. +49-(0)231-755-7218  Fax +49-(0)231-755-5933
>> --
>> Sent from my old-fashioned computer and not from a mobile device.
>> I proudly boycott 24/7 availability.
>> 
>> 

Anne M. Hammond - Systems / Network Administration - Tech-X Corp
                  hammond_at_txcorp.com 720-974-1840



