Jeff and I are taking a look at the logic in that code now - I know we thought we understood it back when we wrote it, but somehow it just doesn't look right any more...

On Jun 14, 2010, at 4:13 PM, Reuti wrote:

> Hi,
>
> On 13.06.2010 at 09:02, Zhang Linbo wrote:
>
>> Hi,
>>
>> I'm new to OpenMPI and have encountered a problem with mpiexec.
>>
>> Since I need to set up the execution environment for OpenMPI programs
>> on the execution nodes, I use the following command line to launch an
>> OMPI program:
>>
>> mpiexec -launch-agent /some_path/myscript ....
>>
>> The problem is: the above command works fine if I invoke 'mpiexec'
>> without an absolute path just like above (assuming the PATH variable
>> is properly set), but if I prepend an absolute path to 'mpiexec', e.g.:
>>
>> /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript ....
>
> Using an absolute path is equivalent to using the --prefix option to `mpiexec`.
> Both ways evidently lead to the erroneous behavior you encounter.
>
>
>> then I get the following error message:
>>
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ;
>> /some_path/myscript /OMPI_dir/bin/(null) --daemonize -mca ess env -mca
>> orte_ess_jobid 1978662912 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2
>> --hnp-uri "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
>
> The reason seems to be that, when a prefix is given, the assembled
> command line includes too many elements. I tried to work around
> this with a new case in "orte/mca/plm/rsh/plm_rsh_module.c":
>
> if (orted_prefix != NULL) {
>     asprintf (&final_cmd,
>               "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>               "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>               "%s",
>               (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>               (opal_prefix != NULL ? opal_prefix : ""),
>               (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>               prefix_dir, bin_base,
>               prefix_dir, lib_base,
>               orted_prefix );
> }
> else {
>     asprintf (&final_cmd,
>               "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>               "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>               "%s %s/%s/%s",
>               (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>               (opal_prefix != NULL ? opal_prefix : ""),
>               (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>               prefix_dir, bin_base,
>               prefix_dir, lib_base,
>               (orted_prefix != NULL ? orted_prefix : ""),
>               prefix_dir, bin_base,
>               orted_cmd);
> }
>
> For simplicity, the name of the agent is stored in "opal_prefix", AFAICS.
>
> This is of course not a clean solution (as "opal_prefix" can't be used any
> more), but rather a proof of concept, as only sh-like shells are handled.
> Surely there are better ways to solve it. Anyway, it's a bug and should be filed.
>
> -- Reuti
>
>
>> I'd like to know what causes the above problem and how I should deal with it.
>> I want to use the absolute pathname of mpiexec to avoid possible interference
>> with other MPI installations. Thanks in advance.
>>
>> LB
>>
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users