Hi guys,

I have OSG installed as a roll on a Rocks cluster of 16 nodes, each with
16 processors. I'm new to OSG, so I left everything at its defaults.
When I submit the mpi-ring example using qsub with 16 or fewer slots, all
processes run on a single, seemingly random node. So I increased the
number of slots to more than 16, hoping the processes would be spread
across different nodes, but instead the job fails with errors.

Here is the script I used to submit mpi-ring:
#!/bin/bash

#$ -cwd
#$ -S /bin/bash
#$ -j y
#$ -pe orte 8
mpirun $HOME/testmpi/mpi-ring

(orte is one of the 4 default parallel environments on the system.)
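
To check how the slots are actually distributed across nodes, something
like the following could be added to the job script before the mpirun
line. This is only a sketch and not part of the script above:
summarize_allocation is a hypothetical helper name, and it assumes the
usual Grid Engine behaviour of setting NSLOTS and PE_HOSTFILE inside a
parallel job, with one "host slots queue processor-range" line per host
in the hostfile.

```shell
#!/bin/bash
# Diagnostic sketch: summarize the slot allocation granted to a job.
# Assumption: each PE_HOSTFILE line is "<host> <slots> <queue> <processor-range>".
summarize_allocation() {
    local hostfile="$1"
    awk '{ printf "%s: %d slots\n", $1, $2; total += $2 }
         END { printf "total: %d slots on %d hosts\n", total, NR }' "$hostfile"
}

# Inside a parallel job Grid Engine sets PE_HOSTFILE and NSLOTS;
# outside a job this part is simply skipped.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "NSLOTS=${NSLOTS:-unset}"
    summarize_allocation "$PE_HOSTFILE"
fi
```

The allocation_rule of the parallel environment, which may influence how
slots are spread over hosts, can be inspected with "qconf -sp orte".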

If I change the number of slots to 17 instead of 8, I get this error:
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
A strange file named core.xxxx was also produced.

Why can't I submit a job with more than 16 slots?

Thanks.
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
