On 16 June 2016 at 00:46, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
> Here is the idea on how to get the number of tasks per node:
>
>     MPI_Comm intranode_comm;
>     int tasks_per_local_node;
>
>     MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
>                         MPI_INFO_NULL, &intranode_comm);
>     MPI_Comm_size(intranode_comm, &tasks_per_local_node);
>     MPI_Comm_free(&intranode_comm);
>
> Then you can get the available memory per node, for example
>
>     grep ^MemAvailable: /proc/meminfo
>
> and divide this by the number of tasks on the local node.
>
> Now, if the distribution should be based on CPU speed, you can simply
> retrieve this value on each task, MPI_Gather() it to rank 0, and do the
> distribution there.
>
> In any case, if you MPI_Gather() the task parameters you are interested
> in, you should be able to get rid of a static config file.
>
> Non-blocking collectives are also available (MPI_Igather[v] /
> MPI_Iscatter[v]); if your algorithm can exploit them, that might be
> helpful.
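The MemAvailable step Gilles describes can be sketched in plain C; `parse_mem_available` is a made-up helper name, and the MPI plumbing (split + gather) is left to the quoted snippet since it needs a running MPI job:

```c
#include <stdio.h>

/* Hypothetical helper (not from the thread): parse the MemAvailable line
   of /proc/meminfo and return the value in bytes, or -1 on failure.
   Each rank could call this on its node and divide the result by
   tasks_per_local_node obtained via MPI_Comm_split_type/MPI_Comm_size. */
long long parse_mem_available(const char *line) {
    long long kb;
    if (sscanf(line, "MemAvailable: %lld kB", &kb) != 1)
        return -1;
    return kb * 1024LL; /* /proc/meminfo reports kB */
}
```

Reading the file itself is then just fgets()-ing /proc/meminfo line by line until this returns a non-negative value.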
I now programmatically collect this info:

    rank/core:  cpufrequency  maxmemavailable
    =========================================
    0           f0            m0
    1           f1            m1
    2           f2            m2
    3           f3            m3
    4           f4            m4

Let's say 1 task needs M bytes. The maximum number of tasks I can do
would be sum(m_i) / M.

Now let's say I am given N tasks to do.

1. A solution that ignores the available memory is to spread N over the
5 cores, giving the highest-frequency cores the highest number of tasks
pro rata, i.e. n_i = N * f_i / sum(f_i) for each core, so that
sum(n_i) = N.

2. A second stage could then ensure that no n_i > m_i / M, which would
involve taking any excess (n_i - m_i/M) and spreading it over the other
cores.

Or perhaps both the CPU frequencies and the max memory could be
considered in one go, but I don't know how to do that?

Thanks

MM
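One way to fold both constraints into a single pass is a capped pro-rata allocation: hand out tasks in proportion to frequency, but never beyond each core's memory cap m_i / M, and keep re-spreading the leftover over the cores that still have room. A sketch of that idea (my own, not from the thread; `distribute` and all names are invented):

```c
/* Hypothetical sketch: distribute N tasks over k cores pro rata by
   frequency f[i], capped at cap[i] = m_i / M tasks per core. Excess
   from saturated cores is re-spread over the cores with room left. */
void distribute(int k, const double *f, const long *cap, long N, long *n) {
    int done[k];                         /* core saturated? (C99 VLA) */
    for (int i = 0; i < k; i++) { n[i] = 0; done[i] = 0; }
    long remaining = N;
    while (remaining > 0) {
        double fsum = 0.0;
        int any = 0;
        for (int i = 0; i < k; i++)
            if (!done[i]) { fsum += f[i]; any = 1; }
        if (!any) break;                 /* N exceeds sum(m_i) / M */
        long assigned = 0;
        for (int i = 0; i < k; i++) {
            if (done[i]) continue;
            long want = (long)(remaining * f[i] / fsum); /* pro-rata floor */
            long room = cap[i] - n[i];
            long take = want < room ? want : room;
            n[i] += take;
            assigned += take;
            if (n[i] >= cap[i]) done[i] = 1;
        }
        if (assigned == 0) {             /* flooring stalled: hand one task
                                            to the fastest core with room */
            int best = -1;
            for (int i = 0; i < k; i++)
                if (!done[i] && (best < 0 || f[i] > f[best])) best = i;
            if (best < 0) break;
            n[best] += 1;
            assigned = 1;
            if (n[best] >= cap[best]) done[best] = 1;
        }
        remaining -= assigned;
    }
}
```

With the five-core table above, rank 0 would MPI_Gather the (f_i, m_i) pairs, run something like this once, and MPI_Scatterv the resulting per-core counts back out.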