Thanks for the link - also setting CXX, F77, and FC did the trick :)
./configure CC=icc CXX=icpc F77=ifort FC=ifort
--prefix=/usr/local/openmpi/1.6.2_intel_12.0.4 --with-sge
--with-hwloc=/usr/local/hwloc/1.5_intel_12.0.4
--with-openib-libdir=/usr/lib64 --with-udapl-libdir=/usr/lib64
works.
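A quick sanity check that the resulting build really picked up the SGE and hwloc
support (just a sketch; the exact component names ompi_info prints can vary by
version):

  /usr/local/openmpi/1.6.2_intel_12.0.4/bin/ompi_info | grep -i gridengine
  /usr/local/openmpi/1.6.2_intel_12.0.4/bin/ompi_info | grep -i hwloc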
What isn't working is when I fire off an MPI job with over 800 ranks:
they don't all actually start up a process.
e.g., if I do srun -n 1024 --ntasks-per-node 12 xhpl
and then do a 'pgrep xhpl | wc -l' on all of the allocated nodes, not
all of them have actually started xhpl;
most will read 12, but some nodes show fewer.
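The per-node check is roughly this (a sketch, assuming it runs inside the
allocation with SLURM_JOB_NODELIST set, and that passwordless ssh to the
compute nodes works):

  # count xhpl processes on every allocated node; each should report 12
  for host in $(scontrol show hostnames "$SLURM_JOB_NODELIST"); do
      echo -n "$host: "
      ssh "$host" 'pgrep xhpl | wc -l'
  done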
I turned on the daemon debugs for orted and noticed this difference.
I get this on all the good nodes (the ones that actually started xhpl):
Daemon was launched on node08 - beginning to initialize
[node08:21230] [[64354,0],1] orted_cmd: received add_local_procs
[node08:21230] [[64354,0],0] orted_rec
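For reference, daemon debug output like the above is what Open MPI emits when
daemon debugging is enabled, e.g. something like:

  mpirun --debug-daemons -n 1024 -npernode 12 xhpl

(that only applies when the job is started through mpirun; more on that below).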
Hi
I don't use Slurm, and our clusters are fairly small (a few tens of nodes,
a few hundred cores).
Having said that, I know that Torque, which we use here,
requires specific system configuration changes on large clusters,
like increasing the maximum number of open files and
increasing the ARP cache size.
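The sort of changes I mean, sketched out (the values here are just placeholders,
not recommendations):

  # /etc/security/limits.conf: raise the per-process open file limit
  *   soft   nofile   65536
  *   hard   nofile   65536

  # /etc/sysctl.conf: enlarge the kernel's ARP/neighbour cache thresholds
  net.ipv4.neigh.default.gc_thresh1 = 4096
  net.ipv4.neigh.default.gc_thresh2 = 8192
  net.ipv4.neigh.default.gc_thresh3 = 16384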
Something doesn't make sense here. If you direct launch with srun, there is no
orted involved. The orted only gets launched if you start with mpirun.
Did you configure --with-pmi and point to where that include file resides?
Otherwise, the procs will all think they are singletons.
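For reference, a sketch of a PMI-enabled build, assuming Slurm's pmi.h and
libpmi are installed under /usr (that path is a guess; point --with-pmi at
wherever they actually live on your system):

  ./configure CC=icc CXX=icpc F77=ifort FC=ifort \
      --prefix=/usr/local/openmpi/1.6.2_intel_12.0.4 \
      --with-sge --with-pmi=/usr \
      --with-hwloc=/usr/local/hwloc/1.5_intel_12.0.4 \
      --with-openib-libdir=/usr/lib64 --with-udapl-libdir=/usr/lib64

and then the direct launch is the same as before:

  srun -n 1024 --ntasks-per-node 12 xhpl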
Hello everyone,
I'm a new student at my university, and I need to install the LAMMPS software to
perform some molecular dynamics simulations for my work. The cluster I am
working on gives me no root access (obviously), so I am installing everything
under my local account. I'm having some difficulty