Hi,

Has anyone here used the HTCondor scheduler with MPI jobs? I followed the openmpiscript example in the condor folder, like this:

[mahmood@rocks7 ~]$ cat mpi.ht
universe = parallel
executable = openmpiscript
arguments = mpihello
log = hellompi.log
output = hellompi.out
error = hellompi.err
machine_count = 2

However, it fails with this error:

[mahmood@rocks7 ~]$ cat hellompi.out
WARNING: MOUNT_UNDER_SCRATCH not set in condor_config
WARNING: MOUNT_UNDER_SCRATCH not set in condor_config
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[62274,1],0]
  Exit code: 1
--------------------------------------------------------------------------

[mahmood@rocks7 ~]$ cat hellompi.err
Not defined: MOUNT_UNDER_SCRATCH
Not defined: MOUNT_UNDER_SCRATCH
[compute-0-1.local:17511] [[62274,1],0] usock_peer_recv_connect_ack: received unexpected process identifier [[62274,0],2] from [[62274,0],1]
[compute-0-1.local:17512] [[62274,1],1] usock_peer_recv_connect_ack: received unexpected process identifier [[62274,0],2] from [[62274,0],1]

Any idea?

Regards,
Mahmood
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users