Jeff and I are taking a look at the logic in that code now - I know we thought 
we understood it back when we wrote it, but somehow it just doesn't look right 
any more...


On Jun 14, 2010, at 4:13 PM, Reuti wrote:

> Hi,
> 
> Am 13.06.2010 um 09:02 schrieb Zhang Linbo:
> 
>> Hi,
>> 
>> I'm new to OpenMPI and have encountered a problem with mpiexec.
>> 
>> Since I need to set up the execution environment for OpenMPI programs
>> on the execution nodes, I use the following command line to launch an
>> OMPI program:
>> 
>>  mpiexec -launch-agent /some_path/myscript ....
>> 
>> The problem is: the above command works fine if I invoke 'mpiexec'
>> without an absolute path just like above (assuming the PATH variable
>> is properly set), but if I prepend an absolute path to 'mpiexec', e.g.:
>> 
>>  /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript ....
> 
> Using an absolute path is equivalent to passing the --prefix option to
> `mpiexec`. Both evidently lead to the erroneous behavior you encounter.
> 
> 
>> then I get the following error message:
>> 
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ; 
>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; 
>> /some_path/myscript /OMPI_dir/bin/(null) --daemonize -mca ess env -mca 
>> orte_ess_jobid 1978662912 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 
>> --hnp-uri "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
> 
> The reason seems to be that, when a prefix is given, the assembled command 
> line contains too many elements. I tried to work around this with a new case 
> in "orte/mca/plm/rsh/plm_rsh_module.c":
> 
>     if (orted_prefix != NULL) {
>         asprintf (&final_cmd,
>                   "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>                   "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>                   "%s",
>                   (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>                   (opal_prefix != NULL ? opal_prefix : ""),
>                   (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>                   prefix_dir, bin_base,
>                   prefix_dir, lib_base,
>                   orted_prefix );
>     }
>     else {
>         asprintf (&final_cmd,
>                   "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>                   "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>                   "%s %s/%s/%s",
>                   (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>                   (opal_prefix != NULL ? opal_prefix : ""),
>                   (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>                   prefix_dir, bin_base,
>                   prefix_dir, lib_base,
>                   (orted_prefix != NULL ? orted_prefix : ""),
>                   prefix_dir, bin_base,
>                   orted_cmd);
>     }
> 
> For simplicity, the name of the agent is stored in "opal_prefix", AFAICS.
> 
> This is of course not a clean solution (as "opal_prefix" can't be used any 
> more), but rather a proof of concept, as only sh-like shells are handled. 
> Surely there are better ways to solve it. Anyway, it's a bug and should be 
> filed.
> 
> -- Reuti
> 
> 
>> I'd like to know what causes the above problem and how I should deal with it.
>> I want to use the absolute pathname of mpiexec to avoid possible interference
>> with other MPI installations. Thanks in advance.
>> 
>> LB
>> 
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 

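To make the failure mode discussed above concrete, here is a minimal sketch in C of assembling the sh-style remote command without ever passing a NULL pointer to `%s`. It assumes that the `(null)` in the quoted error comes from a NULL daemon name being formatted with `%s` (glibc prints `(null)` in that case). The helpers `fmt_alloc` and `build_remote_cmd`, their parameters, and the `ENV_FMT` macro are hypothetical names for illustration only. This is not the Open MPI code, nor the fix that was eventually applied, and like the proof of concept above it only covers sh-like shells.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Environment setup shared by both cases (sh-like shells only). */
#define ENV_FMT "PATH=%s/%s:$PATH ; export PATH ; " \
                "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "

/* Portable stand-in for asprintf(): returns a malloc'd string or NULL. */
static char *fmt_alloc(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int len = vsnprintf(NULL, 0, fmt, ap);   /* first pass: measure */
    va_end(ap);
    if (len < 0) return NULL;

    char *buf = malloc((size_t)len + 1);
    if (buf == NULL) return NULL;

    va_start(ap, fmt);
    vsnprintf(buf, (size_t)len + 1, fmt, ap); /* second pass: format */
    va_end(ap);
    return buf;
}

/*
 * Hypothetical helper: build the remote command line.
 *   agent     - launch agent command, or NULL if none was given
 *   orted_cmd - daemon executable name, or NULL if it must not be appended
 * The caller frees the result.
 */
static char *build_remote_cmd(const char *prefix_dir, const char *bin_base,
                              const char *lib_base, const char *agent,
                              const char *orted_cmd)
{
    assert(prefix_dir != NULL && bin_base != NULL && lib_base != NULL);

    if (orted_cmd == NULL) {
        /* Agent only: do not append "<prefix>/<bin>/<NULL>". */
        return fmt_alloc(ENV_FMT "%s",
                         prefix_dir, bin_base, prefix_dir, lib_base,
                         agent != NULL ? agent : "");
    }

    /* Optional agent, then the daemon under the prefix directory. */
    return fmt_alloc(ENV_FMT "%s%s%s/%s/%s",
                     prefix_dir, bin_base, prefix_dir, lib_base,
                     agent != NULL ? agent : "",
                     agent != NULL ? " " : "",
                     prefix_dir, bin_base, orted_cmd);
}
```

Called as `build_remote_cmd("/OMPI_dir", "bin", "lib", "/some_path/myscript", NULL)`, the sketch yields a command that ends in `/some_path/myscript` with no `/OMPI_dir/bin/(null)` component. Whether the agent should replace the daemon path or merely precede it is a design decision for the real code, so treat the branch structure here as one possible reading.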
