Hello Ralph,

Thanks for your answer.

> I can look to see if there is something generic we can do (perhaps enclosing each param in quotes to avoid any special character issues) - will see if something like that might help. Best that will happen, however, is that we launch the app and then have those procs spit out the error.

Yes, that makes sense.

> The space isn't a "special character" in that sense, and actually exists in some params.

Yes, and that's what I tried to show: arguments with spaces are not passed around correctly. Maybe my example was bad, because it produces an error in both cases. However, one error is expected and the other is not.

Maybe this will show it more clearly. Sorry for using an error case again, but I can't find a simpler way to reproduce at the moment:

Run: $ mpirun -mca mpi_show_handle_leaks "1 --foo" -n 1 -hostfile hosts example/mpi/code

This will give the *expected* error output (the argument including spaces is correctly passed around):

--------------------------------------------------------------------------
An invalid value was supplied for an enum variable.

  Variable     : mpi_show_handle_leaks
  Value        : 1 --foo
  Valid values : 0: f|false|disabled, 1: t|true|enabled
--------------------------------------------------------------------------
...

Now run: $ OMPI_MCA_mpi_show_handle_leaks="1 --foo" mpirun -n 1 -hostfile hosts example/mpi/code

This will give a different, *unexpected* error output:

orted: Error: unknown option "--foo"
Type 'orted --help' for usage.
Usage: orted [OPTION]...
   -am <arg0>            Aggregate MCA parameter set file list
-d|--debug               Debug the OpenRTE
   --daemonize           Daemonize the orted into the background
   --debug-daemons       Enable debugging of OpenRTE daemons
...

For whatever reason --foo is suddenly part of the orted options, which is wrong IMHO.
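A rough sketch of how I think this happens, assuming the orted command line is flattened into one string and split into words again on the remote side. The splitter below is a deliberate simplification (only spaces separate words, only double quotes group them), count_words is a hypothetical helper, and the command strings are made up for illustration; none of it is taken from the Open MPI sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the words a very simple shell-like
 * splitter would see in cmd. Only spaces separate words and only
 * double quotes group them; a real shell has many more rules. */
static size_t count_words(const char *cmd)
{
    size_t words = 0;
    int in_word = 0;    /* currently inside a word */
    int in_quotes = 0;  /* currently inside "..." */

    assert(cmd != NULL);
    for (const char *p = cmd; *p != '\0'; p++) {
        if (*p == '"') {
            /* a quote toggles grouping and may start a new word */
            in_quotes = !in_quotes;
            if (!in_word) {
                in_word = 1;
                words++;
            }
        } else if (*p == ' ' && !in_quotes) {
            /* an unquoted space ends the current word */
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;
        }
    }
    return words;
}
```

With the value unquoted, an illustrative string such as `orted -mca mpi_show_handle_leaks 1 --foo` splits into five words, so --foo ends up as a separate option. With the value in double quotes it stays one word and the line has four.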

Why is there a difference if an option is passed via --mca and via an environment variable?

The reason becomes clear when I look at ./orte/mca/plm/rsh/plm_rsh_module.c (version 1.8.1). Multi-word MCA params passed to mpirun are enclosed in quotes:

/* in the rsh environment, we can append multi-word arguments
 * by enclosing them in quotes. Check for any multi-word
 * mca params passed to mpirun and include them
 */
cnt = opal_argv_count(orted_cmd_line);
for (i=0; i < cnt; i+=3) {
    /* check if the specified option is more than one word - all
     * others have already been passed
     */
    if (NULL != strchr(orted_cmd_line[i+2], ' ')) {
        /* must add quotes around it */
        asprintf(&param, "\"%s\"", orted_cmd_line[i+2]);
        /* now pass it along */
        opal_argv_append(&argc, &argv, orted_cmd_line[i]);
        opal_argv_append(&argc, &argv, orted_cmd_line[i+1]);
        opal_argv_append(&argc, &argv, param);
        free(param);
    }
}

However, values coming from env vars are only quoted if they contain a semicolon:

/* if the value has a special character in it,
 * then protect it with quotes
 */
if (NULL != strchr(value, ';')) {
    char *p2;
    asprintf(&p2, "\"%s\"", value);
    opal_argv_append(&argc, &argv, p2);
    free(p2);
} else ...
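
To illustrate what I mean, here is a minimal sketch of a single check that treats both cases the same way, quoting a value if it contains either a semicolon or a space. quote_if_needed is a hypothetical helper of mine, not an existing Open MPI function, and it uses plain malloc/snprintf instead of asprintf and opal_argv_append so that it compiles on its own. It does not escape double quotes inside the value, so it is not a complete fix.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Open MPI): return a newly
 * allocated copy of value, wrapped in double quotes if it contains
 * a ';' or a ' '. The caller must free() the result.
 * Returns NULL if the allocation fails. */
static char *quote_if_needed(const char *value)
{
    assert(value != NULL);

    if (NULL == strpbrk(value, "; ")) {
        /* nothing to protect, return a plain copy */
        size_t n = strlen(value) + 1;
        char *copy = malloc(n);
        if (NULL != copy) {
            memcpy(copy, value, n);
        }
        return copy;
    }

    /* two quote characters plus the terminating NUL */
    size_t len = strlen(value) + 3;
    char *quoted = malloc(len);
    if (NULL != quoted) {
        snprintf(quoted, len, "\"%s\"", value);
    }
    return quoted;
}
```

Used on the value from my example, quote_if_needed("1 --foo") yields the string "1 --foo" including the surrounding double quotes, while a value without spaces or semicolons is returned unchanged.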

Regards,
Dirk

Message: 3
Date: Fri, 18 Jul 2014 11:21:55 -0700
From: Ralph Castain <r...@open-mpi.org>
To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] Incorrect escaping of OMPI_MCA environment
        variables with spaces (for rsh?)
Message-ID: <285b2f53-eb4d-4e26-a901-901186f24...@open-mpi.org>
Content-Type: text/plain; charset=us-ascii

I'm not exactly sure how to fix what you described. The semicolon is escaped because 
otherwise the cmd line would think it had been separated - the orted cmd line is ssh'd to 
the remote node and cannot include an unescaped terminator. The space isn't a 
"special character" in that sense, and actually exists in some params.

The reason you didn't get an immediate error is that the MCA param you flubbed is only 
read/used by the MPI layer, and mpirun isn't an MPI application. So mpirun has no 
visibility into that param. It gets included on the orted cmd line solely because it was 
given in the environment, and we don't have any current method for separating params out 
to say "this one doesn't apply to an orted".

So it got passed to the remote end. I can look to see if there is something 
generic we can do (perhaps enclosing each param in quotes to avoid any special 
character issues) - will see if something like that might help. Best that will 
happen, however, is that we launch the app and then have those procs spit out 
the error.

On Jul 18, 2014, at 9:45 AM, Dirk Schubert <dschub...@allinea.com> wrote:

Hi,

It seems that OMPI_MCA environment variables with spaces (for rsh?) are 
incorrectly escaped.

This happens with version 1.8 and 1.8.1. I did not try Version 1.7!

To reproduce:

0) ./configure && make && make install # no special configure flags required
1) Create a host file, with a couple of hostnames in it.
2) export OMPI_MCA_mpi_show_handle_leaks="1 --foo"
3) mpirun -hostfile /path/to/hostfile -n 1 /code/to/mpi.exe

Expected:

--------------------------------------------------------------------------
An invalid value was supplied for an enum variable.

  Variable     : mpi_show_handle_leaks
  Value        : 1 --foo
  Valid values : 0: f|false|disabled, 1: t|true|enabled
--------------------------------------------------------------------------
...

Actual:

orted: Error: unknown option "--foo"
Type 'orted --help' for usage.
Usage: orted [OPTION]...
   -am <arg0>            Aggregate MCA parameter set file list
...


A potential patch that fixes the problem (but then I don't know why ";" would 
be considered a special character in the first place).

--- ./orte/mca/plm/rsh/plm_rsh_module.c.orig    2014-07-18 17:25:38.477318000 +0100
+++ ./orte/mca/plm/rsh/plm_rsh_module.c    2014-07-18 17:25:06.936780000 +0100
@@ -618,7 +618,7 @@
                     /* if the value has a special character in it,
                      * then protect it with quotes
                      */
-                    if (NULL != strchr(value, ';')) {
+                    if (NULL != strchr(value, ' ')) {
                         char *p2;
                         asprintf(&p2, "\"%s\"", value);
                         opal_argv_append(&argc, &argv, p2);

Regards,
Dirk
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2014/07/24812.php
