Hello Andy,

You can also use the --report-bindings option of mpirun to check which cores 
your program will use and to which cores the processes are bound.
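
For example (the exact output format differs a bit between Open MPI versions, 
but the option should be available in the 1.6 series you are using):

mpirun --report-bindings -np 20 $EXECUTABLE $INPUT_FILE

Each rank then reports which cores it was bound to (in newer versions a line 
like "MCW rank 0 bound to socket 0[core 0] ..."), so you can check whether the 
20 ranks really land on 20 distinct cores or get stacked onto only a few.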

Are you using the same backend compiler on both systems?

Do you have performance tools available on the systems that let you see in 
which part of the program the time is lost? Common tools would be Score-P/
Vampir/CUBE, TAU, or Extrae/Paraver.
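
As a rough sketch of the Score-P workflow, assuming Score-P is installed on 
both systems (the source file and input names below are just placeholders, 
and for Fortran you would prefix mpif90 instead of mpicc):

# Rebuild with instrumentation by prefixing the MPI compiler wrapper:
scorep mpicc -O2 solver.c -o solver
# Run as usual; this writes a scorep-* measurement directory:
mpirun -np 20 ./solver input.dat
# Look at where the time goes (computation vs. MPI) in the CUBE GUI:
cube scorep-*/profile.cubex

Comparing the fraction of time spent in MPI between the workstation run and 
the cluster run should quickly show whether the slowdown is in communication 
or in the compute parts.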

Best
Christoph

On Wednesday, 1 February 2017 21:09:28 CET Andy Witzig wrote:
> Thank you, Bennet.  From my testing, I've seen that the application usually
> performs better with much smaller rank counts on the workstation.  I've
> tested on the cluster and do not see the same behavior (i.e. it performs
> better there with -np 15 or 20).  The workstation is not shared and is not
> doing any other work.  I ran the application on the workstation with top
> and confirmed that 20 processes were fully loaded.
> 
> I'll look into the diagnostics you mentioned and get back to you.
> 
> Best regards,
> Andy
> 
> On Feb 1, 2017, at 6:15 PM, Bennet Fauber <ben...@umich.edu> wrote:
> 
> How do they compare if you run a much smaller number of ranks, say -np 2 or
> 4?
> 
> Is the workstation shared and doing any other work?
> 
> You could insert some diagnostics into your script, for example run
> uptime and free both before and after your MPI program, and compare
> the output.
> 
> You could also run top in batch mode in the background for your own
> username, then run your MPI program, and compare the results from top.
> We've seen instances where the MPI ranks only get distributed to a
> small number of processors, which you see if they all have small
> percentages of CPU.
> 
> Just flailing in the dark...
> 
> -- bennet
> 
> On Wed, Feb 1, 2017 at 6:36 PM, Andy Witzig <cap1...@icloud.com> wrote:
> > Thanks for the idea.  I did the test and only get a single host.
> > 
> > Thanks,
> > Andy
> > 
> > On Feb 1, 2017, at 5:04 PM, r...@open-mpi.org wrote:
> > 
> > Simple test: replace your executable with "hostname". If you see multiple
> > hosts come out on your cluster, then you know why the performance is
> > different.
> > 
> > On Feb 1, 2017, at 2:46 PM, Andy Witzig <cap1...@icloud.com> wrote:
> > 
> > Honestly, I'm not exactly sure what scheme is being used.  I am using the
> > default template from Penguin Computing for job submission.  It looks
> > like:
> > 
> > #PBS -S /bin/bash
> > #PBS -q T30
> > #PBS -l walltime=24:00:00,nodes=1:ppn=20
> > #PBS -j oe
> > #PBS -N test
> > #PBS -r n
> > 
> > mpirun $EXECUTABLE $INPUT_FILE
> > 
> > I'm not configuring Open MPI anywhere else. It is possible the Penguin
> > Computing folks have pre-configured my MPI environment.  I'll see what I
> > can find.
> > 
> > Best regards,
> > Andy
> > 
> > On Feb 1, 2017, at 4:32 PM, Douglas L Reeder <d...@centurylink.net> wrote:
> > 
> > Andy,
> > 
> > What allocation scheme are you using on the cluster?  For some codes we see
> > noticeable differences using fillup vs. round robin, though not 4x.  Fillup
> > uses more shared memory, while round robin uses more InfiniBand.
> > 
> > Doug
> > 
> > On Feb 1, 2017, at 3:25 PM, Andy Witzig <cap1...@icloud.com> wrote:
> > 
> > Hi Tom,
> > 
> > The cluster uses an InfiniBand interconnect.  On the cluster I'm
> > requesting: #PBS -l walltime=24:00:00,nodes=1:ppn=20.  So technically, the
> > run on the cluster should be SMP on the node, since there are 20
> > cores/node.  On the workstation I'm just using the command: mpirun -np 20
> > ...  I haven't finished setting Torque/PBS up yet.
> > 
> > Best regards,
> > Andy
> > 
> > On Feb 1, 2017, at 4:10 PM, Elken, Tom <tom.el...@intel.com> wrote:
> > 
> > For this case:  " a cluster system with 2.6GHz Intel Haswell with 20 cores
> > / node and 128GB RAM/node.  "
> > 
> > are you running 5 ranks per node on 4 nodes?
> > What interconnect are you using for the cluster?
> > 
> > -Tom
> > 
> > -----Original Message-----
> > From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Andrew
> > Witzig
> > Sent: Wednesday, February 01, 2017 1:37 PM
> > To: Open MPI Users
> > Subject: Re: [OMPI users] Performance Issues on SMP Workstation
> > 
> > By the way, the workstation has a total of 36 cores / 72 threads, so using
> > mpirun -np 20 is possible (and should be equivalent) on both platforms.
> > 
> > Thanks,
> > cap79
> > 
> > On Feb 1, 2017, at 2:52 PM, Andy Witzig <cap1...@icloud.com> wrote:
> > 
> > Hi all,
> > 
> > I'm testing my application on an SMP workstation (dual Intel Xeon E5-2697
> > v4 2.3 GHz Broadwell (boost 2.8-3.1 GHz) processors, 128 GB RAM) and am
> > seeing a 4x performance drop compared to a cluster system with 2.6 GHz
> > Intel Haswell, 20 cores/node and 128 GB RAM/node.  On both systems the
> > application has been compiled using Open MPI 1.6.4.  I have tried running:
> > 
> > 
> > mpirun -np 20 $EXECUTABLE $INPUT_FILE
> > mpirun -np 20 --mca btl self,sm $EXECUTABLE $INPUT_FILE
> > 
> > and others, but cannot achieve the same performance on the workstation as
> > is seen on the cluster.  The workstation outperforms on other non-MPI but
> > multi-threaded applications, so I don't think it's a hardware issue.
> > 
> > 
> > Any help you can provide would be appreciated.
> > 
> > Thanks,
> > cap79
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
