Hi Ralph, thanks for your help!

Ralph Castain writes:
> It would have to be done via MPI_Info arguments, and we never had a
> request to do so (and hence, don't define such an argument). It would
> be easy enough to do so (look in the ompi/mca/dpm/orte/dpm_orte.c
> code).

Well, I wanted to just report success, but I've only got the easy
side of it: saving the values from the MPI_Info arguments into the
orte_job_t struct.  See the attached "0003" patch (against trunk).
However, I couldn't figure out the other side: reading those
environment variables back out and setting them in the child at
fork.  Maybe you could help with (or do :-) that?

Or just point me to the right place again: I put abort() calls in the
'spawn' functions I found under plm/, but my programs didn't abort, so
I'm not sure which code path the launch actually takes.
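
To make concrete what I mean by "the other side", here is a minimal
sketch in plain POSIX C.  It is not Open MPI code: apply_env_entries()
is a hypothetical helper, and it assumes each saved entry is a
"NAME=value" string, which is only my guess at how the "env" info
value would be used.  The idea is that the launcher would call
something like this in the child, between fork() and exec().

```c
#define _POSIX_C_SOURCE 200112L   /* for setenv() under -std=c99 */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not an Open MPI function: export each
 * "NAME=value" string in entries[] into the calling process's
 * environment.  Returns 0 on success, -1 on the first malformed
 * entry, allocation failure, or setenv() failure. */
static int apply_env_entries(char *const entries[], size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        const char *eq = strchr(entries[i], '=');
        if (eq == NULL || eq == entries[i]) {
            return -1;                      /* no '=' or empty name */
        }

        /* Copy the name so it can be NUL-terminated for setenv(). */
        size_t namelen = (size_t)(eq - entries[i]);
        char *name = malloc(namelen + 1);
        if (name == NULL) {
            return -1;
        }
        memcpy(name, entries[i], namelen);
        name[namelen] = '\0';

        int rc = setenv(name, eq + 1, 1);   /* 1: overwrite existing */
        free(name);
        if (rc != 0) {
            return -1;
        }
    }
    return 0;
}
```

With the fields from my patch, the call would presumably look like
apply_env_entries(jdata->env_vars, jdata->nenv_vars), but where that
call belongs in the launch path is exactly the part I couldn't find.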

> MPI implementations generally don't forcibly propagate envars because
> it is so hard to know which ones to handle - it is easy to propagate
> a system envar that causes bad things to happen on the remote end.

I understand.  Though in this case, I'm /trying/ to make Bad Things
(tm) happen ;-).

> One thing you could do, of course, is add that envar to your default
> shell setup (.bashrc or whatever). This would set the variable by
> default on your remote locations (assuming you are using rsh/ssh
> for your launcher), and then any process you start would get
> it. However, that won't help if this is an envar intended only for
> the comm_spawned process.

Unfortunately what I want to play with at the moment are LD_*
variables, and fiddling with these in my .bashrc will mess up a lot
more than just the simulation I am presently hacking.

> I can add this capability to the OMPI trunk, and port it to the 1.7
> release - but we don't go all the way back to the 1.4 series any
> more.

Yes, having this in a 1.7 release would be great!


BTW, I encountered a couple of other small things while grepping
through the source and waiting for trunk to build, so there are two
other small patches attached.  One gets rid of warnings about unused
functions in generated lexing code.  I believe the second fixes
resource leaks on error paths.  However, it turned out none of my
user-level code hit that function at all, so I haven't been able to
test it.  Take from it what you will...
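
Since I couldn't exercise those error paths in the real code, here is
the idea in miniature, as a standalone sketch.  Nothing in it is Open
MPI API: buffer_t, proxy_t, obj_new(), obj_release() and
spawn_sketch() are all made-up stand-ins, and the fail_at parameter is
just a way to force each failure in turn.  Note it uses a single
cleanup label rather than a release before each goto as in my patch,
and unlike the real function it also releases everything on success.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the two objects the real function
 * manages; these are illustrative names, not Open MPI types. */
typedef struct { char *data; } buffer_t;
typedef struct { int id; } proxy_t;

static int live_objects = 0;   /* objects allocated but not released */

static void *obj_new(size_t size)
{
    void *p = calloc(1, size);
    if (p != NULL) {
        ++live_objects;
    }
    return p;
}

static void obj_release(void *p)
{
    if (p != NULL) {
        --live_objects;
        free(p);
    }
}

/* fail_at selects which step reports an error (0 = none), so every
 * error path can be exercised.  Returns 0 on success, -1 on failure. */
static int spawn_sketch(int fail_at)
{
    int rc = -1;
    buffer_t *buf = NULL;
    proxy_t *ps = NULL;

    buf = obj_new(sizeof(*buf));
    if (buf == NULL || fail_at == 1) {      /* e.g. packing failed */
        goto cleanup;
    }
    ps = obj_new(sizeof(*ps));
    if (ps == NULL || fail_at == 2) {       /* e.g. proxy setup failed */
        goto cleanup;
    }
    if (fail_at == 3) {                     /* e.g. the send failed */
        goto cleanup;
    }
    rc = 0;

cleanup:
    /* Single exit point: releases whatever was acquired so far. */
    obj_release(ps);
    obj_release(buf);
    return rc;
}
```

Calling spawn_sketch() with each fail_at value and checking that
live_objects is back at zero afterwards is the kind of test I would
have liked to run against the real function.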

-tom

> On Wed, Dec 11, 2013 at 2:10 PM, tom fogal <tfo...@sci.utah.edu> wrote:
> 
> > Hi all,
> >
> > I'm developing on Open MPI 1.4.5-ubuntu2 on Ubuntu 13.10 (so, Ubuntu's
> > packaged Open MPI) at the moment.
> >
> > I'd like to pass environment variables to processes started via
> > MPI_Comm_spawn.  Unfortunately, the MPI 3.0 standard (at least) does
> > not seem to specify a way to do this; thus I have been searching for
> > implementation-specific ways to accomplish my task.
> >
> > I have tried setting the environment variable using the POSIX setenv(3)
> > call, but it seems that Open MPI comm-spawn'd processes do not inherit
> > environment variables.  See the attached 2 C99 programs; one prints
> > out the environment it receives, and one sets the MEANING_OF_LIFE
> > environment variable, spawns the previous 'env printing' program, and
> > exits.  I run via:
> >
> >   $ env -i HOME=/home/tfogal \
> >   PATH=/bin:/usr/bin:/usr/local/bin:/sbin:/usr/sbin \
> >   mpirun -x TJFVAR=testing -n 5 ./mpienv ./envpar
> >
> > and expect (well, hope) to find the MEANING_OF_LIFE in 'envpar's
> > output.  I do see TJFVAR, but the MEANING_OF_LIFE sadly does not
> > propagate.  Perhaps I am asking the wrong question...
> >
> > I found another MPI implementation which allowed passing such
> > information via the MPI_Info argument, however I could find no
> > documentation of similar functionality in Open MPI.
> >
> > Is there a way to accomplish what I'm looking for?  I could even be
> > convinced to hack source, but a starting pointer would be appreciated.
> >
> > Thanks,
> >
> > -tom

From 8285a7625e5ea014b9d4df5dd65a7642fd4bc322 Mon Sep 17 00:00:00 2001
From: Tom Fogal <tfo...@alumni.unh.edu>
List-Post: users@lists.open-mpi.org
Date: Fri, 13 Dec 2013 12:03:56 +0100
Subject: [PATCH 1/3] btl: Remove warnings about unused lexing functions.

---
 ompi/mca/btl/openib/btl_openib_lex.l | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ompi/mca/btl/openib/btl_openib_lex.l b/ompi/mca/btl/openib/btl_openib_lex.l
index 2aa6059..7455b78 100644
--- a/ompi/mca/btl/openib/btl_openib_lex.l
+++ b/ompi/mca/btl/openib/btl_openib_lex.l
@@ -1,3 +1,5 @@
+%option nounput
+%option noinput
 %{ /* -*- C -*- */
 /*
  * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
-- 
1.8.3.2

From dff9fd5ef69f09de6d0fee2236c39a79e8674f92 Mon Sep 17 00:00:00 2001
From: Tom Fogal <tfo...@alumni.unh.edu>
List-Post: users@lists.open-mpi.org
Date: Fri, 13 Dec 2013 13:06:41 +0100
Subject: [PATCH 2/3] mca: cleanup buf, ps when errors occur.

---
 orte/mca/plm/base/plm_base_proxy.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/orte/mca/plm/base/plm_base_proxy.c b/orte/mca/plm/base/plm_base_proxy.c
index 5d2b100..275cb3a 100644
--- a/orte/mca/plm/base/plm_base_proxy.c
+++ b/orte/mca/plm/base/plm_base_proxy.c
@@ -128,14 +128,15 @@ int orte_plm_proxy_spawn(orte_job_t *jdata)
     command = ORTE_PLM_LAUNCH_JOB_CMD;
     if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &command, 1, ORTE_PLM_CMD))) {
         ORTE_ERROR_LOG(rc);
+        OBJ_RELEASE(buf);
         goto CLEANUP;
     }
     
     /* pack the jdata object */
     if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &jdata, 1, ORTE_JOB))) {
         ORTE_ERROR_LOG(rc);
+        OBJ_RELEASE(buf);
         goto CLEANUP;
-        
     }
     
     /* create the proxy spawn object */
@@ -153,6 +154,7 @@ int orte_plm_proxy_spawn(orte_job_t *jdata)
                                           orte_rml_send_callback, NULL))) {
         ORTE_ERROR_LOG(rc);
         OBJ_RELEASE(buf);
+        OBJ_RELEASE(ps);
         goto CLEANUP;
     }
     
-- 
1.8.3.2

From a90f1fb49df1ff9442476b5e4294353ebb94498b Mon Sep 17 00:00:00 2001
From: Tom Fogal <tfo...@alumni.unh.edu>
List-Post: users@lists.open-mpi.org
Date: Fri, 13 Dec 2013 15:09:10 +0100
Subject: [PATCH 3/3] info: accept env vars desired in child processes

This looks for "env" keys in MPI_Info structures, which should then
be used to forward environment variables from parent to child when
spawning jobs.  However, note this doesn't (yet) change the spawn
machinery.
---
 ompi/mca/dpm/orte/dpm_orte.c | 12 ++++++++++++
 orte/runtime/orte_globals.c  |  2 ++
 orte/runtime/orte_globals.h  |  2 ++
 3 files changed, 16 insertions(+)

diff --git a/ompi/mca/dpm/orte/dpm_orte.c b/ompi/mca/dpm/orte/dpm_orte.c
index 65099a5..b61d6f2 100644
--- a/ompi/mca/dpm/orte/dpm_orte.c
+++ b/ompi/mca/dpm/orte/dpm_orte.c
@@ -680,6 +680,7 @@ static int spawn(int count, const char *array_of_commands[],
     char mapper[OPAL_PATH_MAX];
     int npernode;
     char slot_list[OPAL_PATH_MAX];
+    char envvar[1024]; /* better magic number? */
 
     orte_job_t *jdata;
     orte_app_context_t *app;
@@ -705,6 +706,7 @@ static int spawn(int count, const char *array_of_commands[],
        - "path": list of directories where to look for the executable
        - "file": filename, where additional information is provided.
        - "soft": see page 92 of MPI-2.
+       - "env": environment variables desired in the children.
     */
 
     /* setup the job object */
@@ -1358,6 +1360,16 @@ static int spawn(int count, const char *array_of_commands[],
                     jdata->stdin_target = strtoul(stdin_target, NULL, 10);
                 }
             }
+
+            /* did the user want us to forward any environment variables? */
+            ompi_info_get (array_of_info[i], "env", sizeof(envvar)-1, envvar,
+                           &flag);
+            if ( flag ) {
+              jdata->nenv_vars++;
+              jdata->env_vars = realloc(jdata->env_vars,
+                                        jdata->nenv_vars*sizeof(char*));
+              jdata->env_vars[jdata->nenv_vars-1] = strdup(envvar);
+            }
         }
 
         /* default value: If the user did not tell us where to look for the
diff --git a/orte/runtime/orte_globals.c b/orte/runtime/orte_globals.c
index f3e3029..e4ba975 100644
--- a/orte/runtime/orte_globals.c
+++ b/orte/runtime/orte_globals.c
@@ -742,6 +742,8 @@ static void orte_job_construct(orte_job_t* job)
     job->ckpt_snapshot_ref = NULL;
     job->ckpt_snapshot_loc = NULL;
 #endif
+    job->env_vars = NULL;
+    job->nenv_vars = 0;
 }
 
 static void orte_job_destruct(orte_job_t* job)
diff --git a/orte/runtime/orte_globals.h b/orte/runtime/orte_globals.h
index f284045..d12296b 100644
--- a/orte/runtime/orte_globals.h
+++ b/orte/runtime/orte_globals.h
@@ -463,6 +463,8 @@ typedef struct {
     /* snapshot location */
     char *ckpt_snapshot_loc;
 #endif
+    char** env_vars;
+    size_t nenv_vars;
 } orte_job_t;
 ORTE_DECLSPEC OBJ_CLASS_DECLARATION(orte_job_t);
 
-- 
1.8.3.2
