I am using Open MPI version 2.0.1.
Hi,
Thank you for your replies, but :-) it didn't work for me.
Using hpcc compiled with Open MPI 2.0.1:
I tried to use export PSM_MQ_RECVREQS_MAX=1000 as mentioned by
Howard, but the job didn't take the export into account (I am starting the
job from the home directory of a user, the home
Hi all,
I’m testing my application on an SMP workstation (dual Intel Xeon E5-2697 v4
Broadwell processors, 2.3 GHz base / 2.8-3.1 GHz boost, 128 GB RAM) and am seeing a 4x
performance drop compared to a cluster system with 2.6 GHz Intel Haswell with 20
cores/node and 128 GB RAM/node. Both applicat
Hi Wodel,
As you already figured out, mpirun -x … is the right way to do
it, so the PSM library will read the values when it initializes on every node.
The default value for "PSM_MEMORY" is “normal” and you may change it to
“large”. If you want to look inside the code, it is on
https://github.com
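For example, something along these lines on the mpirun command line (a sketch only;
the value is just the one quoted earlier in the thread, and ./hpcc stands for the
benchmark binary):

  mpirun -np 20 -x PSM_MEMORY=large -x PSM_MQ_RECVREQS_MAX=1000 ./hpcc

With -x, mpirun exports the named environment variables to every launched rank, so
PSM sees them on all nodes regardless of what the remote login shells set.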
By the way, the workstation has a total of 36 cores / 72 threads, so using
mpirun -np 20 is possible (and should be equivalent) on both platforms.
Thanks,
cap79
> On Feb 1, 2017, at 2:52 PM, Andy Witzig wrote:
>
> Hi all,
>
> I’m testing my application on a SMP workstation (dual Intel Xeon E
Hello Howard,
I was wondering if you have been able to look at this issue at all, or if
anyone has any ideas on what to try next.
Thank you,
Brendan
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Brendan Myers
Sent: Tuesday, January 24, 2017 11:11 AM
To: 'Open MPI Use
For this case: " a cluster system with 2.6GHz Intel Haswell with 20 cores /
node and 128GB RAM/node. "
are you running 5 ranks per node on 4 nodes?
What interconnect are you using for the cluster?
-Tom
> -----Original Message-----
> From: users [mailto:users-boun...@lists.open-mpi.org] On Beh
Hi Tom,
The cluster uses an InfiniBand interconnect. On the cluster I’m requesting:
#PBS -l walltime=24:00:00,nodes=1:ppn=20. So technically, the run on the
cluster should be SMP on the node, since there are 20 cores/node. On the
workstation I’m just using the command: mpirun -np 20 …. I hav
Andy,
What allocation scheme are you using on the cluster? For some codes we see
noticeable differences using fill-up vs. round robin, not 4x though. Fill-up uses
more shared memory, while round robin uses more InfiniBand.
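For illustration, in Open MPI the placement policy can be steered from the mpirun
command line with --map-by, roughly like this (./app is a placeholder, and the batch
system's allocation still decides which nodes you actually get):

  mpirun --map-by slot -np 20 ./app   # fill one node's slots before moving on
  mpirun --map-by node -np 20 ./app   # round robin the ranks across nodes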
Doug
> On Feb 1, 2017, at 3:25 PM, Andy Witzig wrote:
>
> Hi Tom,
>
>
Honestly, I’m not exactly sure what scheme is being used. I am using the
default template from Penguin Computing for job submission. It looks like:
#PBS -S /bin/bash
#PBS -q T30
#PBS -l walltime=24:00:00,nodes=1:ppn=20
#PBS -j oe
#PBS -N test
#PBS -r n
mpirun $EXECUTABLE $INPUT_FILE
I’m not c
Simple test: replace your executable with “hostname”. If you see multiple hosts
come out on your cluster, then you know why the performance is different.
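For example, something like:

  mpirun -np 20 hostname | sort | uniq -c

should print each host name with the number of the 20 processes that landed on it
(the sort/uniq part is just a convenience for counting).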
> On Feb 1, 2017, at 2:46 PM, Andy Witzig wrote:
>
> Honestly, I’m not exactly sure what scheme is being used. I am using the
> default tem
You may want to run this by Penguin support, too.
I believe that Penguin on Demand uses Torque, in which case the
nodes=1:ppn=20
is requesting 20 cores on a single node.
If this is Torque, then you should get a host list, with counts, by inserting
uniq -c $PBS_NODEFILE
after the last #PBS line.
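Based on the template posted earlier, a sketch of the modified script would be:

  #PBS -S /bin/bash
  #PBS -q T30
  #PBS -l walltime=24:00:00,nodes=1:ppn=20
  #PBS -j oe
  #PBS -N test
  #PBS -r n
  uniq -c $PBS_NODEFILE
  mpirun $EXECUTABLE $INPUT_FILE

The uniq -c line prints each allocated host once, prefixed by how many slots it
contributes.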
Thanks, Bennet. I made the modification to the Torque submission file and got
“20 n388”, which confirms (like you said) that for my cluster runs I am
requesting 20 cores on a single node.
Best regards,
Andy
On Feb 1, 2017, at 5:15 PM, Bennet Fauber wrote:
You may want to run this by Penguin
Thanks for the idea. I did the test and only get a single host.
Thanks,
Andy
On Feb 1, 2017, at 5:04 PM, r...@open-mpi.org wrote:
Simple test: replace your executable with “hostname”. If you see multiple hosts
come out on your cluster, then you know why the performance is different.
> On Feb 1
How do they compare if you run a much smaller number of ranks, say -np 2 or 4?
Is the workstation shared and doing any other work?
You could insert some diagnostics into your script, for example uptime and
free, both before and after running your MPI program, and compare.
You could also run top
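A sketch of what that wrapping could look like (./your_app is a placeholder for the
real command line):

  uptime; free
  mpirun -np 20 ./your_app
  uptime; free

Comparing the load average and free memory before and after the run can show whether
something else was competing for the machine.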
I have compiled Open MPI 2.0.2 on a new MacBook running OS X 10.12 and have
been trying to run a simple program.
I configured Open MPI with
../configure --disable-shared --prefix ~/.local
make all install
Then I have a simple code only containing a call to MPI_Init.
I compile it with
mpirun -np 2 ./m
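For reference, a minimal program of the kind described would look something like this
(a sketch, not necessarily the exact test code):

  /* init_only.c - calls MPI_Init and MPI_Finalize, nothing else */
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      MPI_Init(&argc, &argv);
      MPI_Finalize();
      return 0;
  }

compiled and run with the wrappers from that install, e.g. mpicc init_only.c -o
init_only and then mpirun -np 2 ./init_only.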
Thank you, Bennet. From my testing, I’ve seen that the application usually
performs better at much smaller rank counts on the workstation. I’ve tested on the
cluster and do not see the same response (i.e., I see better performance at
-np 15 or 20). The workstation is not shared and is not