Hi John,

Mpiexec isn't needed with OMPI. In fact, if you are using the one from OSC, it only works with MPICH.

Instead, just build OMPI with --with-tm, and it will link against TORQUE and start up and track jobs properly.

-Joshua Bernstein
Penguin Computing

On Mar 14, 2010, at 21:35, "John R. Cary" <c...@txcorp.com> wrote:

I have a script that launches a bunch of runs on some compute nodes of
a cluster.  Once I get through the queue, I query PBS for my machine
file, then I copy that to a local file 'nodes' which I use for mpiexec:

mpiexec -machinefile /home/research/cary/projects/vpall/vptests/nodes -np 6 /home/research/cary/projects/vpall/builds/vorpal/par/vorpal/vorpal -i bathtubAntenna.in -dim 2 -o bathtubAntenna2p -n 100 -d 100

but this fails with

[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../../orte/mca/ras/tm/ras_tm_module.c at line 153
[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../../orte/mca/ras/tm/ras_tm_module.c at line 87
[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../orte/mca/ras/base/ras_base_allocate.c at line 133
[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../orte/mca/plm/base/plm_base_launch_support.c at line 72
[node47:07004] [[25769,0],0] ORTE_ERROR_LOG: File open failure in file ../../../../../orte/mca/plm/tm/plm_tm_module.c at line 167
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
launch so we are aborting.

The appropriate code snippet is

    /* setup the full path to the PBS file */
    filename = opal_os_path(false, mca_ras_tm_component.nodefile_dir,
                            pbs_jobid, NULL);
    fp = fopen(filename, "r");
    if (NULL == fp) {
        ORTE_ERROR_LOG(ORTE_ERR_FILE_OPEN_FAILURE);
        free(filename);
        return ORTE_ERR_FILE_OPEN_FAILURE;
    }

which kind of looks like it might be trying to open my pbs file instead
of the file I gave on the command line?  I really don't know, but does
anyone have any ideas here?
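
To make that reading concrete, here is a minimal standalone sketch of what the check above appears to do: the path is assembled only from a directory and the PBS job id, so a file passed with -machinefile would not enter into it. The names build_nodefile_path and check_nodefile are hypothetical, made up for illustration and not Open MPI functions, and the snprintf join is only an assumed stand-in for opal_os_path.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for opal_os_path(): joins a directory and a
 * job id into "dir/jobid". Returns 0 on success, -1 if buf is too small. */
static int build_nodefile_path(char *buf, size_t len,
                               const char *nodefile_dir, const char *jobid)
{
    int n = snprintf(buf, len, "%s/%s", nodefile_dir, jobid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Mirrors the quoted check: try to open the per-job node file and report
 * a failure instead of continuing. Returns 0 if readable, -1 otherwise. */
static int check_nodefile(const char *nodefile_dir, const char *jobid)
{
    char path[4096];
    FILE *fp;

    if (build_nodefile_path(path, sizeof(path), nodefile_dir, jobid) != 0) {
        fprintf(stderr, "path too long\n");
        return -1;
    }
    fp = fopen(path, "r");
    if (NULL == fp) {
        /* Same condition that triggers ORTE_ERR_FILE_OPEN_FAILURE above. */
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
        return -1;
    }
    fclose(fp);
    return 0;
}
```

Calling check_nodefile with the node file directory and job id of a running job would print the exact path that cannot be opened, which may be easier to act on than the bare error code.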

Thx....John Cary
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
