Hi Ralph, all,

To give some background, I'm part of the RADICAL-Pilot [1] development team. RADICAL-Pilot is a Pilot System, an implementation of the Pilot (job) concept, which in its most minimal form takes care of decoupling resource acquisition from workload management. So instead of launching your real_science.exe through PBS, you submit a Pilot, which then allows you to perform application-level scheduling. The most obvious use case is running many (relatively) small tasks, where you really don't want to go through the batch system every time. That is besides the fact that these machines are very bad at managing many tasks anyway.

The recent discussion we had about Spawn() on Crays also originates here. I want to free myself from having to use aprun for every task, and therefore I am interested to see whether ompi and/or orte can be the vehicle for that.

> On 21 Jan 2015, at 17:16 , Ralph Castain <r...@open-mpi.org> wrote:
> Theoretically, yes - see the ORCM project, which basically does what you ask.
> The launch system in there isn’t fully implemented yet, but the fundamental
> idea is valid and supports some range of capability.

I took a closer look at ORCM and it clearly overlaps with what I want to achieve. One thing I noticed is that parts of it run as root. Why is that?

> We used to have a cmd line option in ORTE for what you propose - it wouldn’t
> be too hard to restore. Is there some reason to do so?

Can you point me to something I could look for in the repo history? Then I can see whether it serves my purpose. If you think there is enough to warrant looking at ORCM in more detail, I'm happy to do that too.

Cheers,

Mark

1. https://github.com/radical-cybertools/radical.pilot