It’s working just fine, Howard - we found the problem.
> On Mar 25, 2015, at 9:12 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
>
> Mark,
>
> If you want to use the orte-submit feature, you will need to get mpirun
> working.
>
> Could you rerun using the mpirun launch method but with
>
> --mca oob_base_verbose 10 --mca ess_base_verbose 2
>
> set?
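>
> For example (borrowing the two-rank a.out from your aprun runs as a
> stand-in for your actual invocation), something along these lines:
>
>   mpirun -np 2 --mca oob_base_verbose 10 --mca ess_base_verbose 2 ./a.out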
>
>
> Also, you may want to make sure you are using the ipogif0 network interface.
> This can be controlled with the oob_tcp_if_include MCA parameter, e.g.
>
> mpirun --mca oob_tcp_if_include ipogif0
>
> In suggesting this option, I'm assuming your use case doesn't require
> connectivity between processes running on the compute nodes and some
> external service.
>
> 2015-03-25 8:14 GMT-06:00 Mark Santcroos <mark.santcr...@rutgers.edu>:
> Hi Howard,
>
> > On 25 Mar 2015, at 14:58, Howard Pritchard <hpprit...@gmail.com> wrote:
> > How are you building ompi?
>
> My configure is rather straightforward:
> ./configure --prefix=$OMPI_PREFIX --disable-getpwuid
>
> Maybe I got spoiled on Hopper/Edison and I need more explicit configuration
> on BW ...
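>
> (If it does turn out to need more, I'd guess something along these lines,
> where the --with-alps / --with-pmi flags and the PMI path are just my
> assumptions for a Cray environment:)
>
> ./configure --prefix=$OMPI_PREFIX --disable-getpwuid \
>     --with-alps --with-pmi=/opt/cray/pmi/default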
>
> > Also, what happens if you use aprun?
>
> Not sure if you meant in combination with mpirun or not, so I'll provide both:
>
> > aprun -n2 ./a.out
> Hello from rank 1, thread 0, on nid16869. (core affinity = 0)
> Hello from rank 0, thread 0, on nid16868. (core affinity = 0)
> After sleep from rank 1, thread 0, on nid16869. (core affinity = 0)
> After sleep from rank 0, thread 0, on nid16868. (core affinity = 0)
> Application 23791589 resources: utime ~0s, stime ~2s, Rss ~27304, inblocks
> ~13229, outblocks ~66
>
> > aprun -n2 mpirun ./a.out
> apstat: error opening /ufs/alps_shared/reservations: No such file or directory
> apstat: error opening /ufs/alps_shared/reservations: No such file or directory
> [nid16868:17876] [[699,0],0] ORTE_ERROR_LOG: File open failure in file
> ../../../../../orte/mca/ras/tm/ras_tm_module.c at line 159
> [nid16868:17876] [[699,0],0] ORTE_ERROR_LOG: File open failure in file
> ../../../../../orte/mca/ras/tm/ras_tm_module.c at line 85
> [nid16868:17876] [[699,0],0] ORTE_ERROR_LOG: File open failure in file
> ../../../../orte/mca/ras/base/ras_base_allocate.c at line 190
> [nid16869:17034] [[9344,0],0] ORTE_ERROR_LOG: File open failure in file
> ../../../../../orte/mca/ras/tm/ras_tm_module.c at line 159
> [nid16869:17034] [[9344,0],0] ORTE_ERROR_LOG: File open failure in file
> ../../../../../orte/mca/ras/tm/ras_tm_module.c at line 85
> [nid16869:17034] [[9344,0],0] ORTE_ERROR_LOG: File open failure in file
> ../../../../orte/mca/ras/base/ras_base_allocate.c at line 190
> Application 23791590 exit codes: 1
> Application 23791590 resources: utime ~0s, stime ~2s, Rss ~27304, inblocks
> ~9596, outblocks ~478
>
> > I work with ompi on the NERSC Edison and Hopper systems daily.
>
> I use Edison and Hopper too, and there it works for me indeed.
>
> > Typically I use aprun, though.
>
> I want to use orte-submit and friends, so I "explicitly" don't want to use
> aprun.
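>
> (The workflow I'm after is roughly a persistent DVM plus submissions
> against it; sketched from memory here, so the exact orte-submit options
> may differ:)
>
> orte-dvm &                  # start a persistent ORTE DVM on the allocation
> orte-submit -np 2 ./a.out   # submit a job to the running DVM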
>
> > You definitely don't need to use CCM, and shouldn't.
>
> Depends on the use-case, but happy to leave that out of scope for now :-)
>
> Thanks!
>
> Mark
>
>
> >
> > On Mar 25, 2015 6:00 AM, "Mark Santcroos" <mark.santcr...@rutgers.edu> wrote:
> > Hi,
> >
> > Any users of Open MPI on Blue Waters here?
> > By that I specifically mean in "native" mode, not inside CCM.
> >
> > After configuring and building as I do on other Crays, mpirun gives me the
> > following:
> > [nid25263:31700] [[23896,0],0] ORTE_ERROR_LOG: Authentication failed in
> > file ../../../../../orte/mca/oob/tcp/oob_tcp_connection.c at line 803
> > [nid25263:31700] [[23896,0],0] ORTE_ERROR_LOG: Authentication failed in
> > file ../../../../../orte/mca/oob/tcp/oob_tcp_connection.c at line 803
> >
> > Version is the latest and greatest from git.
> >
> > So I'm interested to hear whether people have been successful on Blue
> > Waters, and/or whether this error rings a bell.
> >
> > Thanks!
> >
> > Mark
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/03/26520.php