On Apr 15, 2011, at 2:59 AM, Reuti wrote:

> Hi,
>
> On 15.04.2011 at 07:25, Asad Ali wrote:
>
>> <snip>
>> Yes. The entire job gets restarted.
>
> Maybe this is caused by a signal sent to the job by Condor, so that it gets
> terminated and, as a result, Condor restarts it. Can you trap the signals in
> your application as a test?
>
>
>> If so, you had best talk to the Condor folks - it has nothing to do with
>> Open MPI, but is due to a job control flag you are passing to Condor.
>>
>> I have talked to them several times. But most of the cluster users are
>> non-MPI users, and thus they don't have much knowledge about the
>> configuration of MPI with Condor.
>> If you know anyone who uses Condor for running MPI jobs, then please let
>> me know.
>
> Is the use of Open MPI supported by Condor? In former times they had a
> special universe for MPICH(1), and only for an older version, to run parallel
> jobs under Condor. Did this change?

See https://bugzilla.redhat.com/show_bug.cgi?id=537232

At one time, it appears such a script existed. You might start with the one
offered here, and/or check the web for updates.

I would also go to the Condor web site:

http://www.cs.wisc.edu/condor/

A search for "openmpi" revealed several presentations on how to make this work.

>
> -- Reuti
>
>
>> Cheers,
>>
>> Asad
>>
>>
>>
>> On Apr 14, 2011, at 6:37 PM, Asad Ali wrote:
>>
>>> Hi all,
>>>
>>> I am using Condor to run my MPI jobs on a large cluster of nodes. The jobs
>>> run fine, but after some time they automatically get restarted. What can be
>>> the reason?
>>>
>>> Cheers,
>>>
>>> Asad
>>>
>>> --
>>> "A Bayesian is one who, vaguely expecting a horse, and catching a glimpse
>>> of a donkey, strongly believes he has seen a mule."
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
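
On Reuti's suggestion above to trap signals as a test: a minimal sketch of what that could look like, in Python for brevity. The list of watched signals is my assumption, not something Condor documents here, and the names `WATCHED`, `received`, `log_signal` and `install_handlers` are made up for illustration. The same idea applies with sigaction() in a C MPI program. Note that SIGKILL and SIGSTOP cannot be trapped, so a restart with nothing logged does not rule out a signal.

```python
import os
import signal
import sys
import time

# Assumption: these are signals a batch system might plausibly send before
# evicting or restarting a job. Adjust the list for your site.
WATCHED = [signal.SIGTERM, signal.SIGINT, signal.SIGHUP,
           signal.SIGUSR1, signal.SIGUSR2]

# Names of the signals seen so far, in arrival order.
received = []


def log_signal(signum, frame):
    """Record the signal and write a timestamped line to stderr."""
    name = signal.Signals(signum).name
    received.append(name)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    sys.stderr.write(f"{stamp} pid={os.getpid()} caught {name}\n")
    sys.stderr.flush()


def install_handlers(signals=WATCHED):
    """Replace the default action for each watched signal with the logger."""
    for sig in signals:
        signal.signal(sig, log_signal)
```

Call `install_handlers()` once at the start of the job and check the job's stderr file after the next restart. Because the handler only logs and returns, the job keeps running after a trapped signal, which is what you want for a diagnostic run but probably not for production.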