> On Mar 25, 2016, at 12:53 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>
> Should
>   -bind-to-core
> also help?
No - if you specify pe=N, then you will automatically bind to core.

> Does the error I get matter? Should we install the libnumactl
> and libnumactl-devel packages? Thanks!

Yes! The warning you are getting is telling you that memory may not be
bound local to your process - which really can hurt performance.

> Ron
>
> ---
> Ron Cohen
> recoh...@gmail.com
> skypename: ronaldcohen
> twitter: @recohen3
>
> On Fri, Mar 25, 2016 at 3:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> Yeah, it can really have an impact! It is unfortunately highly
>> application-specific, so all we can do is provide the tools.
>>
>> As you can see from the binding map, we are tight-packing the procs on each
>> node to maximize the use of shared memory. However, this assumes that each
>> rank is predominantly going to “talk” to rank +/- 1 - i.e., the pattern
>> involves nearest-neighbor ranks. If that isn’t true (e.g., the lowest-ranked
>> process on one node talks to the lowest-ranked process on the next node,
>> etc.), then this would be a bad mapping for performance.
>>
>> In that case, you can use the “rank-by” option to maintain the location and
>> binding, but change the assigned MCW ranks to align with your communication
>> pattern.
>>
>> HTH
>> Ralph
>>
>> On Mar 25, 2016, at 12:28 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> So I have been experimenting with different mappings, and performance
>> varies a lot. The best I find is:
>>   -map-by slot:pe=2 -np 32
>> with 2 threads, which gives:
>> [n001.cluster.com:29647] MCW rank 0 bound to socket 0[core 0[hwt 0]],
>> socket 0[core 1[hwt 0]]: [B/B/./././././.][./././././././.]
>> [n001.cluster.com:29647] MCW rank 1 bound to socket 0[core 2[hwt 0]],
>> socket 0[core 3[hwt 0]]: [././B/B/./././.][./././././././.]
>> [n001.cluster.com:29647] MCW rank 2 bound to socket 0[core 4[hwt 0]],
>> socket 0[core 5[hwt 0]]: [././././B/B/./.][./././././././.]
>> [n001.cluster.com:29647] MCW rank 3 bound to socket 0[core 6[hwt 0]],
>> socket 0[core 7[hwt 0]]: [././././././B/B][./././././././.]
>> [n001.cluster.com:29647] MCW rank 4 bound to socket 1[core 8[hwt 0]],
>> socket 1[core 9[hwt 0]]: [./././././././.][B/B/./././././.]
>> [n001.cluster.com:29647] MCW rank 5 bound to socket 1[core 10[hwt 0]],
>> socket 1[core 11[hwt 0]]: [./././././././.][././B/B/./././.]
>> [n001.cluster.com:29647] MCW rank 6 bound to socket 1[core 12[hwt 0]],
>> socket 1[core 13[hwt 0]]: [./././././././.][././././B/B/./.]
>> [n001.cluster.com:29647] MCW rank 7 bound to socket 1[core 14[hwt 0]],
>> socket 1[core 15[hwt 0]]: [./././././././.][././././././B/B]
>> [n003.cluster.com:29842] MCW rank 16 bound to socket 0[core 0[hwt 0]],
>> socket 0[core 1[hwt 0]]: [B/B/./././././.][./././././././.]
>> [n002.cluster.com:32210] MCW ra
>> ...
>>
>> On Fri, Mar 25, 2016 at 3:13 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> So
>>   -map-by node:pe=2 -np 32
>> runs and gives great performance, though a little worse than plain -n 32.
>> It puts the correct number of processes on each node, but does so round
>> robin. Is there a way to do this without the round robin? Also note the
>> error message:
>>
>> ====================== ALLOCATED NODES ======================
>> n001: slots=16 max_slots=0 slots_inuse=0 state=UP
>> n004.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>> n003.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>> n002.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>> =================================================================
>> --------------------------------------------------------------------------
>> WARNING: a request was made to bind a process. While the system
>> supports binding the process itself, at least one node does NOT
>> support binding memory to the process location.
>>
>>   Node: n001
>>
>> This usually is due to not having the required NUMA support installed
>> on the node. In some Linux distributions, the required support is
>> contained in the libnumactl and libnumactl-devel packages.
>> This is a warning only; your job will continue, though performance may
>> be degraded.
>> --------------------------------------------------------------------------
>> [n001.cluster.com:29316] MCW rank 0 bound to socket 0[core 0[hwt 0]],
>> socket 0[core 1[hwt 0]]: [B/B/./././././.][./././././././.]
>> [n001.cluster.com:29316] MCW rank 4 bound to socket 0[core 2[hwt 0]],
>> socket 0[core 3[hwt 0]]: [././B/B/./././.][./././././././.]
>> [n001.cluster.com:29316] MCW rank 8 bound to socket 0[core 4[hwt 0]],
>> socket 0[core 5[hwt 0]]: [././././B/B/./.][./././././././.]
>> [n001.cluster.com:29316] MCW rank 12 bound to socket 0[core 6[hwt 0]],
>> socket 0[core 7[hwt 0]]: [././././././B/B][./././././././.]
>> [n001.cluster.com:29316] MCW rank 16 bound to socket 1[core 8[hwt 0]],
>> socket 1[core 9[hwt 0]]: [./././././././.][B/B/./././././.]
>> [n001.cluster.com:29316] MCW rank 20 bound to socket 1[core 10[hwt 0]],
>> socket 1[core 11[hwt 0]]: [./././././././.][././B/B/./././.]
>> [n001.cluster.com:29316] MCW rank 24 bound to socket 1[core 12[hwt 0]],
>> socket 1[core 13[hwt 0]]: [./././././././.][././././B/B/./.]
>> [n001.cluster.com:29316] MCW rank 28 bound to socket 1[core 14[hwt 0]],
>> socket 1[core 15[hwt 0]]: [./././././././.][././././././B/B]
>> [n003.cluster.com:29704] MCW rank 22 bound to socket 1[core 10[hwt 0]],
>> socket 1[core 11[hwt 0]]: [./././././././.][././B/B/./././.]
>>
>> On Fri, Mar 25, 2016 at 2:32 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> So it seems my
>>   -map-by core:pe=2 -n 32
>> should have worked: I would have 32 procs with 2 cores each, 64 cores total.
>> But it doesn't.
>>
>> On Fri, Mar 25, 2016 at 2:19 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> pe=N tells us to map N cores (we call them “processing elements” because
>> they could be HTs if you --use-hwthreads-as-cpus) to each process. So we
>> will bind each process to N cores.
>>
>> So if you want 16 procs, each with two processing elements assigned to them
>> (which is a good choice if you are using 2 threads/process), then you would
>> use:
>>
>>   mpirun -map-by core:pe=2 -np 16
>>
>> If you add -report-bindings, you’ll see each process bound to two cores,
>> with the procs tightly packed on each node until that node’s cores are
>> fully utilized. We do handle the unlikely event that you asked for a
>> non-integer multiple of cores - i.e., if you have 32 cores on a node, and
>> you ask for pe=6, we will wind up leaving two cores idle.
>>
>> HTH
>> Ralph
>>
>> On Mar 25, 2016, at 11:11 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> or is it mpirun -map-by core:pe=8 -n 16 ?
>>
>> On Fri, Mar 25, 2016 at 2:10 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> Thank you - I looked at the man page and it is not clear to me what
>> pe=2 does. Is that the number of threads? So if I want 16 MPI procs
>> with 2 threads each on 32 cores (two nodes), is it
>>
>>   mpirun -map-by core:pe=2 -n 16
>>
>> ?
>>
>> Sorry if I mangled this.
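[Editor's note: the pe=N packing rule described above reduces to simple integer arithmetic. The sketch below is an illustrative toy, not Open MPI code; the node sizes are the 16-core nodes from this thread plus Ralph's hypothetical 32-core example.]

```python
# Toy sketch (not Open MPI code) of the -map-by core:pe=N packing rule:
# each process is bound to N cores, and a node is filled until no whole
# group of N cores remains.

def pack_node(cores_per_node: int, pe: int):
    """Return (procs that fit on one node, cores left idle)."""
    return cores_per_node // pe, cores_per_node % pe

# 16-core node with pe=2: 8 two-core procs, no cores wasted.
print(pack_node(16, 2))  # (8, 0)

# Ralph's example: 32-core node with pe=6 leaves two cores idle.
print(pack_node(32, 6))  # (5, 2)
```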
>>
>> Ron
>>
>> On Fri, Mar 25, 2016 at 2:03 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> Okay, what I would suggest is that you use the following cmd line:
>>
>>   mpirun -map-by core:pe=2 (or 8, or whatever number you want)
>>
>> This should give you the best performance, as it will tight-pack the procs
>> and assign them to the correct number of cores. See if that helps.
>>
>> On Mar 25, 2016, at 10:38 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> 1.10.2
>>
>> Ron
>>
>> On Fri, Mar 25, 2016 at 1:30 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> Hmmm… what version of OMPI are you using?
>>
>> On Mar 25, 2016, at 10:27 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> --report-bindings didn't report anything.
>>
>> On Fri, Mar 25, 2016 at 1:24 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> --display-allocation didn't seem to give useful information:
>>
>> ====================== ALLOCATED NODES ======================
>> n005: slots=16 max_slots=0 slots_inuse=0 state=UP
>> n008.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>> n007.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>> n006.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>> =================================================================
>>
>> for
>>   mpirun -display-allocation --map-by ppr:8:node -n 32
>>
>> Ron
>>
>> On Fri, Mar 25, 2016 at 1:17 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> Actually there was the same number of procs per node in each case. I
>> verified this by logging into the nodes while they were running - in
>> both cases 4 per node.
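[Editor's note: the diagnostic flags exercised in this thread can be combined on one command line. This is a sketch only, using flags that appear above; `./my_app` and the counts are placeholders, and the command must be run inside a real batch allocation.]

```shell
# Show the slots Open MPI believes it was given, and where each rank was
# bound; -map-by core:pe=2 binds each of the 16 procs to two cores.
mpirun --display-allocation --report-bindings \
       -map-by core:pe=2 -np 16 ./my_app
```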
>>
>> Ron
>>
>> On Fri, Mar 25, 2016 at 1:14 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> On Mar 25, 2016, at 9:59 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> It is very strange, but my program runs slower with any of these
>> choices than if I simply use:
>>
>>   mpirun -n 16
>> with
>>   #PBS -l nodes=n013.cluster.com:ppn=4+n014.cluster.com:ppn=4+n015.cluster.com:ppn=4+n016.cluster.com:ppn=4
>> for example.
>>
>> This command will tightly pack as many procs as possible on a node - note
>> that we may well not see the PBS directives regarding number of ppn. Add
>> --display-allocation and let’s see how many slots we think were assigned
>> on each node.
>>
>> The timing for the latter is 165 seconds, and for
>>   #PBS -l nodes=4:ppn=16,pmem=1gb
>>   mpirun --map-by ppr:4:node -n 16
>> it is 368 seconds.
>>
>> It will typically be faster if you pack more procs/node, as they can use
>> shared memory for communication.
>>
>> Ron
>>
>> On Fri, Mar 25, 2016 at 12:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> On Mar 25, 2016, at 9:40 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> Thank you! I will try it!
>>
>> What would
>>   -cpus-per-proc 4 -n 16
>> do?
>>
>> This would bind each process to 4 cores, filling each node with procs until
>> the cores on that node were exhausted, to a total of 16 processes within
>> the allocation.
>>
>> Ron
>>
>> On Fri, Mar 25, 2016 at 12:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> Add -rank-by node to your cmd line. You’ll still get 4 procs/node, but they
>> will be ranked by node instead of consecutively within a node.
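[Editor's note: the two rank orderings discussed in this thread can be illustrated with a toy model. This is plain Python, not Open MPI internals; the node and rank counts are made up to match the 4-procs-per-node case above.]

```python
# Toy model of the two rank-numbering schemes: ranks packed
# consecutively within each node, versus ranks dealt out round
# robin ("by node") across the nodes.

def packed(nranks: int, ppn: int) -> list:
    """Node index of each rank when ranks fill one node at a time."""
    return [r // ppn for r in range(nranks)]

def round_robin(nranks: int, nnodes: int) -> list:
    """Node index of each rank when ranks are striped across nodes."""
    return [r % nnodes for r in range(nranks)]

# 8 ranks on 2 nodes, 4 procs per node: identical placement totals,
# different numbering - which matters if rank r talks mostly to r +/- 1.
print(packed(8, 4))       # [0, 0, 0, 0, 1, 1, 1, 1]
print(round_robin(8, 2))  # [0, 1, 0, 1, 0, 1, 0, 1]
```

With packed numbering, neighboring ranks usually share a node and can communicate through shared memory; with round-robin numbering they sit on different nodes and must cross the network, which is consistent with the roughly 2x slowdown reported above.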
>>
>> On Mar 25, 2016, at 9:30 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> I am using
>>
>>   mpirun --map-by ppr:4:node -n 16
>>
>> and this loads the processes in round-robin fashion. This seems to be
>> twice as slow for my code as loading them node by node, 4 processes
>> per node.
>>
>> How can I load them not round robin, but node by node?
>>
>> Thanks!
>>
>> Ron
>>
>> ---
>> Ronald Cohen
>> Geophysical Laboratory
>> Carnegie Institution
>> 5251 Broad Branch Rd., N.W.
>> Washington, D.C. 20015
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/03/28828.php