Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Ralph Castain
Yes, that is correct Ralph On Thu, Mar 27, 2014 at 4:15 PM, Gus Correa wrote: > On 03/27/2014 05:58 PM, Jeff Squyres (jsquyres) wrote: > >> On Mar 27, 2014, at 4:06 PM, "Sasso, John (GE Power & Water, Non-GE)" >> > wrote: > >> >> Yes, I noticed that I could not find --display-map in any of t

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Reuti
On 27.03.2014 at 23:59, Dave Love wrote: > Reuti writes: > >> Do all of them have an internal bookkeeping of granted cores to slots >> - i.e. not only the number of scheduled slots per job per node, but >> also which core was granted to which job? Does Open MPI read this >> information would be

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-27 Thread Edgar Gabriel
I will resubmit a new patch; Rob sent me a pointer to the correct solution. It's on my to-do list for tomorrow/this weekend. Thanks Edgar On 3/27/2014 5:45 PM, Dave Love wrote: > Edgar Gabriel writes: > >> not sure honestly. Basically, as suggested in this email chain earlier, >> I had to disabl

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
On 03/27/2014 05:58 PM, Jeff Squyres (jsquyres) wrote: On Mar 27, 2014, at 4:06 PM, "Sasso, John (GE Power & Water, Non-GE)" wrote: Yes, I noticed that I could not find --display-map in any of the man pages. Intentional? Oops; nope. I'll ask Ralph to add it... Nah ... John: As far as I

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Ralph Castain
Oooh...it's Jeff's fault! Fwiw you can get even more detailed mapping info with --display-devel-map Sent from my iPhone > On Mar 27, 2014, at 2:58 PM, "Jeff Squyres (jsquyres)" > wrote: > >> On Mar 27, 2014, at 4:06 PM, "Sasso, John (GE Power & Water, Non-GE)" >> wrote: >> >> Yes, I no

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Dave Love
Reuti writes: > Do all of them have an internal bookkeeping of granted cores to slots > - i.e. not only the number of scheduled slots per job per node, but > also which core was granted to which job? Does Open MPI read this > information would be the next question then. OMPI works with the bindi

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Lloyd Brown
I don't know about your users, but experience has, unfortunately, taught us to assume that users' jobs are very, very badly-behaved. I choose to assume that it's incompetence on the part of programmers and users, rather than malice, though. :-) Lloyd Brown Systems Administrator Fulton Supercomput

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Dave Love
Gus Correa writes: > On 03/27/2014 05:05 AM, Andreas Schäfer wrote: >>> >Queue systems won't allow resources to be oversubscribed. [Maybe that meant that resource managers can, and typically do, prevent resources being oversubscribed.] >> I'm fairly confident that you can configure Slurm to ove

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Dave Love
Gus Correa writes: > Torque+Maui, SGE/OGE, and Slurm are free. [OGE certainly wasn't free, but it apparently no longer exists -- another thing Oracle screwed up and eventually dumped.] > If you build the queue system with cpuset control, a node can be > shared among several jobs, but the cpus/c

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-03-27 Thread Dave Love
Edgar Gabriel writes: > not sure honestly. Basically, as suggested in this email chain earlier, > I had to disable the PVFS2_IreadContig and PVFS2_IwriteContig routines > in ad_pvfs2.c to make the tests pass. Otherwise the tests worked but > produced wrong data. I did not have however the time to

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Jeff Squyres (jsquyres)
On Mar 27, 2014, at 4:06 PM, "Sasso, John (GE Power & Water, Non-GE)" wrote: > Yes, I noticed that I could not find --display-map in any of the man pages. > Intentional? Oops; nope. I'll ask Ralph to add it... -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http:

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
On 03/27/2014 04:10 PM, Reuti wrote: Hi, On 27.03.2014 at 20:15, Gus Correa wrote: Awesome, but now here is my concern. If we have OpenMPI-based applications launched as batch jobs via a batch scheduler like SLURM, PBS, LSF, etc. (which decides the placement of the app and dispatches it to

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Reuti
Hi, On 27.03.2014 at 20:15, Gus Correa wrote: > >> Awesome, but now here is my concern. > If we have OpenMPI-based applications launched as batch jobs > via a batch scheduler like SLURM, PBS, LSF, etc. > (which decides the placement of the app and dispatches it to the compute > hosts), > then

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Sasso, John (GE Power & Water, Non-GE)
Yes, I noticed that I could not find --display-map in any of the man pages. Intentional? -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa Sent: Thursday, March 27, 2014 3:26 PM To: Open MPI Users Subject: Re: [OMPI users] Mapping ranks to hosts

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
On 03/27/2014 03:02 PM, Ralph Castain wrote: Or use --display-map to see the process to node assignments Aha! That one was not on my radar. Maybe because somehow I can't find it in the OMPI 1.6.5 mpiexec man page. However, it seems to work with that version also, which is great. (--display-map

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Sasso, John (GE Power & Water, Non-GE)
Thank you! That also works and is very helpful. -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: Thursday, March 27, 2014 3:03 PM To: Open MPI Users Subject: Re: [OMPI users] Mapping ranks to hosts (from MPI error messages) Or use --disp

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
Hi John I just sent a PS message ... On 03/27/2014 02:41 PM, Sasso, John (GE Power & Water, Non-GE) wrote: Thank you, Gus! I did go through the mpiexec/mpirun man pages but wasn't quite clear that -report-bindings was what I was looking for. So what I did is rerun a program w/ --report-binding

Re: [OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread Ralph Castain
Agreed - Jeff and I discussed this just this morning. I will be updating FAQ soon Sent from my iPhone > On Mar 27, 2014, at 9:24 AM, Gus Correa wrote: > > <\begin hijacking this thread> > > I second Saliya's thanks to Tetsuya. > I've been following this thread, to learn a bit more about > how

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Ralph Castain
Or use --display-map to see the process to node assignments Sent from my iPhone > On Mar 27, 2014, at 11:47 AM, Gus Correa wrote: > > PS - The (OMPI 1.6.5) mpiexec default is -bind-to-none, > in which case -report-bindings won't report anything. > > So, if you are using the default, > you can

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
PS - The (OMPI 1.6.5) mpiexec default is -bind-to-none, in which case -report-bindings won't report anything. So, if you are using the default, you can apply Joe Landman's suggestion (or alternatively use the MPI_Get_processor_name function, in lieu of uname(&uts); cpu_name = uts.nodename; ). Ho

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Sasso, John (GE Power & Water, Non-GE)
Thank you, Gus! I did go through the mpiexec/mpirun man pages but wasn't quite clear that -report-bindings was what I was looking for. So what I did is rerun a program w/ --report-bindings but no bindings were reported. Scratching my head, I decided to include --bind-to-core as well. Voila,

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
Hi John Take a look at the mpiexec/mpirun options: -report-bindings (this one should report what you want) and maybe also: -bycore, -bysocket, -bind-to-core, -bind-to-socket, ... and similar, if you want more control over where your MPI processes run. "man mpiexec" is your friend! I hope

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Joe Landman
On 03/27/2014 01:53 PM, Sasso, John (GE Power & Water, Non-GE) wrote: When a piece of software built against OpenMPI fails, I will see an error referring to the rank of the MPI task which incurred the failure. For example: MPI_ABORT was invoked on rank 1236 in communicator MPI_COMM_WORLD with e

[OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Sasso, John (GE Power & Water, Non-GE)
When a piece of software built against OpenMPI fails, I will see an error referring to the rank of the MPI task which incurred the failure. For example: MPI_ABORT was invoked on rank 1236 in communicator MPI_COMM_WORLD with errorcode 1. Unfortunately, I do not have access to the software code,

Re: [OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread Gus Correa
<\begin hijacking this thread> I second Saliya's thanks to Tetsuya. I've been following this thread, to learn a bit more about how to use hardware locality with OpenMPI effectively. [I am still using "--bycore"+"--bind-to-core" in most cases, and "--cpus-per-proc" occasionally when in hybrid MPI+

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Reuti
On 27.03.2014 at 16:31, Gus Correa wrote: > On 03/27/2014 05:05 AM, Andreas Schäfer wrote: >>> >Queue systems won't allow resources to be oversubscribed. >> I'm fairly confident that you can configure Slurm to oversubscribe >> nodes: just specify more cores for a node than are actually present. >

Re: [OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread Saliya Ekanayake
Thank you, this is really helpful. Saliya On Thu, Mar 27, 2014 at 5:11 AM, wrote: > > > Mapping and binding is related to so called process affinity. > It's a bit difficult for me to explain ... > > So please see this URL below(especially the first half part > of it - from 1 to 20 pages): > >

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Gus Correa
On 03/27/2014 05:05 AM, Andreas Schäfer wrote: >Queue systems won't allow resources to be oversubscribed. I'm fairly confident that you can configure Slurm to oversubscribe nodes: just specify more cores for a node than are actually present. That is true. If you lie to the queue system about

[OMPI users] Hamster

2014-03-27 Thread madhurima madhunapanthula
Hi, I came across Hamster while reading an article on Hadoop + OpenMPI. Please let me know if the sources of Hamster are available for build and testing. -- Lokah samasta sukhinobhavanthu Thanks, Madhurima

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Thomas Heller
On 03/27/2014 10:19 AM, Andreas Schäfer wrote: On 14:26 Wed 26 Mar , Ross Boylan wrote: [Main part is at the bottom] On Wed, 2014-03-26 at 19:28 +0100, Andreas Schäfer wrote: If you have a complex workflow with varying computational loads, then you might want to take a look at runtime syste

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Andreas Schäfer
On 14:26 Wed 26 Mar , Ross Boylan wrote: > [Main part is at the bottom] > On Wed, 2014-03-26 at 19:28 +0100, Andreas Schäfer wrote: > > If you have a complex workflow with varying computational loads, then > > you might want to take a look at runtime systems which allow you to > > express this

Re: [OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread tmishima
Mapping and binding are related to so-called process affinity. It's a bit difficult for me to explain ... So please see the URL below (especially the first half of it, pages 1 to 20): http://www.slideshare.net/jsquyres/open-mpi-explorations-in-process-affinity-eurompi13-presentation A

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Andreas Schäfer
Heya, On 19:21 Wed 26 Mar , Gus Correa wrote: > On 03/26/2014 05:26 PM, Ross Boylan wrote: > > [Main part is at the bottom] > > On Wed, 2014-03-26 at 19:28 +0100, Andreas Schäfer wrote: > >> On 09:08 Wed 26 Mar , Ross Boylan wrote: > >>> Second, we do not operate in a batch queuing environ

Re: [OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread Saliya Ekanayake
Thank you Tetsuya - it worked. Btw. what's the difference between mapping and binding? I think I am a bit confused here. Thank you, Saliya On Thu, Mar 27, 2014 at 4:19 AM, wrote: > > > Hi Saliya, > > What you want to do is map-by node. So please try below: > > -np 2 --map-by node:pe=4 --bind-to

Re: [OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread tmishima
Hi Saliya, What you want to do is map-by node. So please try the following: -np 2 --map-by node:pe=4 --bind-to core You might not need to add --bind-to core, because it's the default binding. Tetsuya > Hi, > > I see in v.1.7.5rc5 --cpus-per-proc is deprecated and is advised to replace by --map-by :PE=N.

[OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread Saliya Ekanayake
Hi, I see that in v.1.7.5rc5 --cpus-per-proc is deprecated and the advice is to replace it with --map-by :PE=N. I've tried this but I couldn't get the expected allocation of procs. For example, I was running 2 procs on 2 nodes, each with 2 sockets, where a socket has 4 cores. I wanted 1 proc per node and bound t