You can improve performance by using --bind-to socket or --bind-to numa, as
this will keep each process inside the same memory region. You can also help
separate the jobs by using the --cpu-set option to tell each job which cpus
it should use - we'll stay within that envelope.
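
Something like the following untested sketch, for two 4-process jobs sharing
an 8-core node (the core numbers are only an example - check the node's real
layout with lstopo first; --cpu-set takes a comma-delimited list of logical
cpu ids):

  # job 1: restrict mpirun to cores 0-3 and bind one rank per core
  mpirun --cpu-set 0,1,2,3 --bind-to core -np 4 ./app1

  # job 2: restrict mpirun to cores 4-7
  mpirun --cpu-set 4,5,6,7 --bind-to core -np 4 ./app2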



On Tue, Aug 12, 2014 at 8:33 AM, Reuti <re...@staff.uni-marburg.de> wrote:

> On 12.08.2014 at 16:57, Antonio Rago wrote:
>
> > Brilliant, this works!
> > However I have to say that the code seems to become slightly less
> performant.
> > Is there a way to instruct mpirun which cores to use, and maybe create
> this map automatically with grid engine?
>
> In the open source version of SGE the requested core binding is only a
> soft request. The Univa version can handle it as a hard request though,
> as the scheduler does the assignment and knows which cores are in use. I
> have no information on whether this is forwarded to Open MPI
> automatically. I assume not, and that it must be read out of the machine
> file (there ought to be an extra column for it in their version) and fed
> to Open MPI by some means.
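
(For reference, if one did extract the assigned cores by hand, they can be
fed to Open MPI via a rankfile. An untested sketch - hostname and core
numbers are made up, and I can't confirm the Univa machinefile format:

  $ cat myrankfile
  rank 0=node01 slot=0
  rank 1=node01 slot=1
  rank 2=node01 slot=2
  rank 3=node01 slot=3
  $ mpirun -np 4 --rankfile myrankfile ./a.out

Generating that file from the scheduler's machine file would still need a
small site-specific script.)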
>
> -- Reuti
>
>
> > Thanks in advance
> > Antonio
> >
> >
> >
> >
> > On 12 Aug 2014, at 14:10, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> wrote:
> >
> >> The quick and dirty answer is that in the v1.8 series, Open MPI started
> binding MPI processes to cores by default.
> >>
> >> When you run 2 independent jobs on the same machine in the way in which
> you described, the two jobs won't have knowledge of each other, and
> therefore they will both start binding MPI processes beginning with
> logical core 0.
> >>
> >> The easy workaround is to disable bind-to-core behavior.  For example,
> "mpirun --bind-to none ...".  In this way, the OS will (more or less) load
> balance your MPI jobs to available cores (assuming you don't run more MPI
> processes than cores).
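
(A quick way to verify what actually happened is to add --report-bindings to
the mpirun command line, e.g.

  mpirun --report-bindings --bind-to none -np 4 ./a.out

In the 1.8 series this prints the binding of each rank to stderr, which makes
it easy to see whether two jobs ended up on the same cores.)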
> >>
> >>
> >> On Aug 12, 2014, at 7:05 AM, Antonio Rago <antonio.r...@plymouth.ac.uk>
> wrote:
> >>
> >>> Dear mailing list
> >>> I’m running into trouble with the configuration of the small cluster
> I’m managing.
> >>> I’ve installed openmpi-1.8.1 with gcc 4.7 on CentOS 6.5 with
> InfiniBand support.
> >>> Compilation and installation were fine, and I can compile and actually
> run parallel jobs, both directly and by submitting them with the queue
> manager (gridengine).
> >>> My problem is that when subsets of two different jobs end up on the
> same node, they do not spread out and use all the cores of the node;
> instead they run on a common subset of cores, leaving the others
> completely empty.
> >>> For example, two 4-core jobs on an 8-core node result in only 4 cores
> being used on the node (all of them oversubscribed) while the other 4
> cores stay empty.
> >>> Clearly there must be an error in the way I’ve configured things, but
> I cannot find any hint on how to solve the problem.
> >>> I’ve tried different mappings (map by core and by slot) but never
> succeeded.
> >>> Could you give me a suggestion on this issue?
> >>> Regards
> >>> Antonio
> >>>
> >>
> >>
> >> --
> >> Jeff Squyres
> >> jsquy...@cisco.com
> >> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>
> >
> >
>
>
