Hello,
I have the following three single-socket nodes:

node1:  4 GB RAM, 2 cores: ranks 0-1
node2:  4 GB RAM, 4 cores: ranks 2-5
node3:  8 GB RAM, 4 cores: ranks 6-9

I have a model that takes an input and produces an output, and I want to run
this model for N possible combinations of inputs. N is very large and I am
limited by memory capacity.

I am using collectives on the world communicator (MPI_COMM_WORLD), and I want
to know how to distribute N over the 10 ranks, given the memory specs of each
node.

For now, I have simply been dividing N by the number of ranks and
scattering/gathering that way.
How can I improve on this without hardcoding the specs in my own C++ code?
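
To make the question concrete, here is a rough sketch of the kind of weighting
I have in mind. It is only the arithmetic, with no MPI calls. It assumes each
rank could find its node's RAM at run time (for example with
sysconf(_SC_PHYS_PAGES) on Linux) and the number of ranks sharing its node (for
example with MPI_Comm_split_type and MPI_COMM_TYPE_SHARED), and that the
resulting memory-per-rank values were collected on the root with
MPI_Allgather or MPI_Gather. The names weighted_counts and displacements are
hypothetical helpers of my own, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: split n_items across ranks in proportion to the memory
// available to each rank. mem_per_rank[i] is assumed to be
// (RAM of the node hosting rank i) / (number of ranks on that node), in MB.
// Note: n_items * mem_per_rank[i] must fit in a long long.
std::vector<int> weighted_counts(long long n_items,
                                 const std::vector<long long>& mem_per_rank)
{
    assert(!mem_per_rank.empty());
    const long long total =
        std::accumulate(mem_per_rank.begin(), mem_per_rank.end(), 0LL);
    assert(total > 0);

    std::vector<int> counts(mem_per_rank.size());
    long long assigned = 0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        // Integer division rounds down, so a few items may be left over.
        counts[i] = static_cast<int>(n_items * mem_per_rank[i] / total);
        assigned += counts[i];
    }

    // Hand out the leftover items one at a time, starting from rank 0.
    long long leftover = n_items - assigned;
    for (std::size_t i = 0; leftover > 0; i = (i + 1) % counts.size()) {
        ++counts[i];
        --leftover;
    }
    return counts;
}

// Hypothetical helper: starting offset of each rank's chunk in the full array.
std::vector<int> displacements(const std::vector<int>& counts)
{
    std::vector<int> displs(counts.size(), 0);
    for (std::size_t i = 1; i < counts.size(); ++i) {
        displs[i] = displs[i - 1] + counts[i - 1];
    }
    return displs;
}
```

For my three nodes the memory per rank would be 2048 MB on node1, 1024 MB on
node2 and 2048 MB on node3, so ranks on node2 would get half as many inputs as
the others. The two arrays would then go to MPI_Scatterv and MPI_Gatherv as the
counts and displacements arguments instead of my current equal split. What I do
not know is whether there is a better, more standard way to obtain the
per-node numbers.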

Thanks,
