We're going to take this off-list so we quit peppering you all with the development. We'll report back when we have something more concrete, should anyone else be interested.
On Wed, Feb 4, 2015 at 2:22 AM, Mark Santcroos <mark.santcr...@rutgers.edu> wrote:
> Ok great, sounds like a plan!
>
> > On 04 Feb 2015, at 2:53 , Ralph Castain <r...@open-mpi.org> wrote:
> >
> > Appreciate your patience! I'm somewhat limited this week by being on travel to our HQ, so I don't have access to my usual test cluster. I'll be better situated to complete the implementation once I get home.
> >
> > For now, some quick thoughts:
> >
> > 1. stdout/stderr: yes, I just need to "register" orte-submit as the one to receive those from the submitted job.
> >
> > 2. That one is going to be a tad trickier, but is resolvable. May take me a little longer to fix.
> >
> > 3. dang - I thought I had it doing so. I'll look to find the issue. I suspect it's just a case of correctly setting the return code of orte-submit.
> >
> > I'd welcome the help! Let me ponder the best way to point you to the areas needing work, and we can kick around off-list about who does what.
> >
> > Great to hear this is working with your tool so quickly!!
> > Ralph
> >
> >
> > On Tue, Feb 3, 2015 at 3:49 PM, Mark Santcroos <mark.santcr...@rutgers.edu> wrote:
> > Hi Ralph,
> >
> > Besides the items in the other mail, I have three more items that would need resolving at some point.
> >
> > 1. STDOUT/STDERR currently go to the orte-dvm console.
> > I'm sure this is not a fundamental limitation.
> > Even if getting the information to the orte-submit instance would be problematic, the orte-dvm writing this to a file per session would be good enough too.
> >
> > 2. Failing applications currently tear down the dvm.
> > Ideally that would not be the case, and this would be handled in relation to item (3).
> > Possibly this needs to be configurable, if others would like to see different behaviour.
> >
> > 3. orte-submit doesn't return the exit code of the application.
> >
> > To be clear, I realise the current implementation is a proof of concept, so these are no complaints, just wishes of where I hope to see this going!
> >
> > FWIW: these items might require less intricate knowledge of OMPI in general, so with some pointers/guidance I can probably work on these myself if needed.
> >
> > Cheers,
> >
> > Mark
> >
> > ps. I did a quick-and-dirty integration with our own tool and the ORTE abstraction maps like a charm!
> > (https://github.com/radical-cybertools/radical.pilot/commit/2d36e886081bf8531097edfc95ada1826257e460)
> >
> > > On 03 Feb 2015, at 20:38 , Mark Santcroos <mark.santcr...@rutgers.edu> wrote:
> > >
> > > Hi Ralph,
> > >
> > >> On 03 Feb 2015, at 16:28 , Ralph Castain <r...@open-mpi.org> wrote:
> > >> I think I fixed some of the handshake issues - please give it another try.
> > >> You should see orte-submit properly shutdown upon completion,
> > >
> > > Indeed, it works on my laptop now! Great!
> > > It feels quite fast too, for short tasks :-)
> > >
> > >> and orte-dvm properly shutdown when sent the terminate cmd.
> > >
> > > ACK. This also works as expected.
> > >
> > >> I was able to cleanly run MPI jobs on my laptop.
> > >
> > > Do you also see the following errors/warnings on the dvm side?
> > >
> > > [netbook:28324] [[20896,0],0] Releasing job data for [INVALID]
> > > Hello, world, I am 0 of 1, (Open MPI v1.9a1, package: Open MPI mark@netbook Distribution, ident: 1.9.0a1, repo rev: dev-811-g7299cc3, Unreleased developer copy, 132)
> > > [netbook:28324] sess_dir_finalize: proc session dir does not exist
> > > [netbook:28324] [[20896,0],0] dvm: job [20896,20] has completed
> > > [netbook:28324] [[20896,0],0] Releasing job data for [20896,20]
> > >
> > > The "INVALID" message is there for every "submit", the sess_dir_finalize exists per instance/core.
> > > Is that something to worry about, that needs fixing or is that a configuration issue?
> > >
> > > I haven't been able to test on Edison because of maintenance (today+tomorrow), so I will report on that later.
> > >
> > > Thanks again!
> > >
> > > Mark
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post: http://www.open-mpi.org/community/lists/users/2015/02/26282.php