Hi Brian,
We don't use MPI.
Our use model initially launches a primary task (and a few lightweight child 
tasks) on the primary launch host. Thereafter, the primary task launches 
parallel tasks on the other allocated nodes through rsh/ssh or, preferably, the 
scheduler's own remote task launcher (in LSF, for example, that is blaunch). We 
need a Slurm equivalent of that launcher as well, and we don't know yet whether 
one exists.
It is imperative that our primary task starts on the primary launch host, 
because it needs to talk to the underlying hardware there before going ahead 
with the other remote launches.
-Bhaskar.
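
For readers following the thread: inside an existing allocation, srun starts a 
job step on the allocated nodes, so it may be able to play the role blaunch 
plays in LSF. The following is only a sketch under that assumption. The names 
launch_on, other_nodes, LAUNCHER and WORKER_CMD are made up for illustration; 
srun's --nodelist, -N and -n options, scontrol show hostnames, and the 
SLURM_JOB_NODELIST and SLURMD_NODENAME variables are standard Slurm.

```shell
#!/bin/sh
# LAUNCHER and WORKER_CMD are hypothetical knobs for this sketch.
LAUNCHER="${LAUNCHER:-srun}"
WORKER_CMD="${WORKER_CMD:-./worker_task}"

# Launch one task on one named node of the allocation.
# Usage: launch_on <node> <command> [args...]
launch_on() {
    node="$1"; shift
    # --nodelist pins the step to that node; -N1 -n1 asks for one task on one node.
    "$LAUNCHER" --nodelist="$node" -N1 -n1 "$@"
}

# Read node names on stdin (one per line) and print all except <self>.
other_nodes() {
    awk -v self="$1" '$0 != self'
}

# Only attempt remote launches when running inside a Slurm allocation.
if [ -n "${SLURM_JOB_NODELIST:-}" ]; then
    self="${SLURMD_NODENAME:-$(uname -n)}"
    for n in $(scontrol show hostnames "$SLURM_JOB_NODELIST" | other_nodes "$self"); do
        launch_on "$n" "$WORKER_CMD" &
    done
    wait
fi
```

The primary task would run this after it has finished talking to the hardware. 
Whether concurrent steps get the resources they need depends on how the job was 
allocated, so that part needs checking on the actual cluster.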
    On Monday 4 November, 2024 at 06:44:44 am IST, Brian Andrus via slurm-users 
<slurm-users@lists.schedmd.com> wrote:  
 
  
Bhaskar,
 
As I think about it, the assignment of process 0's node may well come from your 
MPI, since that is where you decide how to lay out the processes (pack a node, 
spread them equally, etc.). I would look at the options/settings that apply to 
the particular flavor of MPI you are using.
 
For example, Open MPI has:
 
    To order processes' ranks in MPI_COMM_WORLD:
 
        --rank-by <foo>
               Rank in round-robin fashion according to the specified
               object, defaults to slot. Supported options include slot,
               hwthread, core, L1cache, L2cache, L3cache, socket, numa,
               board, and node.
 
 
Brian Andrus
 
 On 11/3/2024 12:06 AM, Bhaskar Chakraborty wrote:
  
 
 Hi Brian,
  Thanks for the response! However, this approach, where we have to accept 
whatever starting node Slurm gives us and deal with it accordingly, doesn't 
work for us. 
  I think Slurm should have the flexibility to switch the starting node on 
request, through some C API. This is possible in other scheduling systems such 
as LSF. 
  Any other way to do this with the current Slurm code base is welcome. 
  Regards, Bhaskar.
 
 
  
 
On Friday, November 1, 2024, 1:12 AM, Brian Andrus via slurm-users 
<slurm-users@lists.schedmd.com> wrote:
 
   
There are likely many ways to do this, but if some of your code depends on 
something, the check for it could be in the code itself.
 
So instead of process 0 being the required process to run, it would be 
whichever process meets the requirements.
 
e.g.:
 
case "$(hostname)" in
 harold)
     # Run harold's stuff here
     ;;
 *)
     # Run all other stuff here
     ;;
 esac
 
Takes some coding effort but keeps control of the processes within your own 
code.
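
A slightly fuller sketch of the same idea, assuming every task runs the same 
script and picks its role from the node it lands on. PRIMARY_HOST and 
role_for_host are names made up for this example; SLURMD_NODENAME is set by 
Slurm, and uname -n is used as a fallback outside Slurm.

```shell
#!/bin/sh
# PRIMARY_HOST is a hypothetical setting for this sketch, not a Slurm variable.
PRIMARY_HOST="${PRIMARY_HOST:-h2}"

# Print "primary" if the given host is the chosen primary host, else "worker".
role_for_host() {
    case "$1" in
        "$PRIMARY_HOST") echo primary ;;
        *)               echo worker ;;
    esac
}

# Prefer the node name Slurm reports; fall back to the short system host name.
this_host="${SLURMD_NODENAME:-$(uname -n)}"
this_host="${this_host%%.*}"

echo "$this_host runs as: $(role_for_host "$this_host")"
```

The echo would be replaced by the real primary or worker code path.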
 
Brian Andrus
 
  On 10/30/24 09:35, Bhaskar Chakraborty via slurm-users wrote:
  
 
  Hi, 
  Is there a way to change/control the primary node (i.e. where the initial 
task starts) of a job's allocation? 
  For example, if a job requires 6 CPUs and its allocation is distributed over 
3 hosts, h1, h2 and h3, I find that the task always starts on one particular 
node (say h1), irrespective of how many slots were available on each host. 
  Can we somehow get Slurm to use h2 as the primary node? 
  Is there a C API inside the select plugin that can do this, if we were to 
control it through the configured select plugin? 
  Thanks. -Bhaskar.  
     
 -- 
 slurm-users mailing list -- slurm-users@lists.schedmd.com
 To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
  
  
