Hi guys, I have OSG installed as a roll on a Rocks cluster of 16 nodes, each with 16 processors. I'm new to OSG, so I left everything at its defaults. When I submit the mpi-ring example using qsub with 16 or fewer slots, all processes run on a single, randomly chosen node. So I increased the number of slots to more than 16, hoping the processes would spread across different nodes, but instead the job fails with errors.
Here is the script I used to submit mpi-ring:

    #!/bin/bash
    #$ -cwd
    #$ -S /bin/bash
    #$ -j y
    #$ -pe orte 8
    mpirun $HOME/testmpi/mpi-ring

(orte is one of the 4 default parallel environments on the system.)

If I change the number of slots from 8 to 17, I get this error:

    APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)

A strange file named core.xxxx was also produced.

Why can't I submit a job with more than 16 slots?

Thanks.
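
In case it helps with diagnosis, below is a minimal sketch of how the slot allocation for such a job could be inspected from inside the job script. It assumes the standard Grid Engine job variables $NSLOTS and $PE_HOSTFILE are set, and that each line of the PE hostfile begins with a hostname followed by its slot count. The function name summarize_pe_hostfile is hypothetical and not part of Grid Engine.

```shell
#!/bin/bash
# Hypothetical helper (not part of Grid Engine): print the slots granted
# per host, plus the total, from a PE hostfile.
# Assumption: each line starts with "<hostname> <slots>".
summarize_pe_hostfile() {
    awk '{ slots[$1] += $2; total += $2 }
         END {
             for (h in slots) printf "%s %d\n", h, slots[h]
             printf "total %d\n", total
         }' "$1" | sort
}

# Inside a running job, Grid Engine sets PE_HOSTFILE and NSLOTS.
# Outside a job this block is skipped.
if [ -n "$PE_HOSTFILE" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "NSLOTS=$NSLOTS"
    summarize_pe_hostfile "$PE_HOSTFILE"
fi
```

Placing the last block before the mpirun line would show in the job output whether a 17-slot request was actually split across two nodes. The allocation rule of the parallel environment itself can be viewed with `qconf -sp orte`.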