Thanks a lot!

Now, for the env var "OMPI_MCA_orte_nodes", what do I put exactly? Our
nodes each have a short and a long name (it's RHEL 5.x, so the hostname
command returns the long name) and at least 2 IP addresses.
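
For example (just guessing, with made-up node names), would something like

  OMPI_MCA_orte_nodes=node01,node02,node03

with the short names work, or does the list have to use the fully-qualified
names (or one of the IP addresses)?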

p.

On Tue, Jul 27, 2010 at 12:06 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Okay, fixed in r23499. Thanks again...
>
>
> On Jul 26, 2010, at 9:47 PM, Ralph Castain wrote:
>
>> Doh - yes it should! I'll fix it right now.
>>
>> Thanks!
>>
>> On Jul 26, 2010, at 9:28 PM, Philippe wrote:
>>
>>> Ralph,
>>>
>>> I was able to test the generic module, and it seems to be working.
>>>
>>> One question, though: the function orte_ess_generic_component_query in
>>> "orte/mca/ess/generic/ess_generic_component.c" calls getenv with the
>>> argument "OMPI_MCA_env", which seems to cause the module to fail to
>>> load. Shouldn't it be "OMPI_MCA_ess"?
>>>
>>> .....
>>>
>>>   /* only pick us if directed to do so */
>>>   if (NULL != (pick = getenv("OMPI_MCA_env")) &&
>>>                0 == strcmp(pick, "generic")) {
>>>       *priority = 1000;
>>>       *module = (mca_base_module_t *)&orte_ess_generic_module;
>>>
>>> ...
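>>>
>>> i.e., my guess is that the check is meant to read (untested):
>>>
>>>   /* only pick us if directed to do so */
>>>   if (NULL != (pick = getenv("OMPI_MCA_ess")) &&
>>>                0 == strcmp(pick, "generic")) {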
>>>
>>> p.
>>>
>>> On Thu, Jul 22, 2010 at 5:53 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> Dev trunk looks okay right now - I think you'll be fine using it. My new 
>>>> component -might- work with 1.5, but probably not with 1.4. I haven't 
>>>> checked either of them.
>>>>
>>>> Anything at r23478 or above will have the new module. Let me know how it 
>>>> works for you. I haven't tested it myself, but am pretty sure it should 
>>>> work.
>>>>
>>>>
>>>> On Jul 22, 2010, at 3:22 PM, Philippe wrote:
>>>>
>>>>> Ralph,
>>>>>
>>>>> Thank you so much!!
>>>>>
>>>>> I'll give it a try and let you know.
>>>>>
>>>>> I know it's a tough question, but how stable is the dev trunk? Can I
>>>>> just grab the latest and run, or am I better off taking your changes
>>>>> and copying them back into a stable release? (If so, which one: 1.4 or 1.5?)
>>>>>
>>>>> p.
>>>>>
>>>>> On Thu, Jul 22, 2010 at 3:50 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> It was easier for me to just construct this module than to explain how 
>>>>>> to do so :-)
>>>>>>
>>>>>> I will commit it this evening (couple of hours from now) as that is our 
>>>>>> standard practice. You'll need to use the developer's trunk, though, to 
>>>>>> use it.
>>>>>>
>>>>>> Here are the envars you'll need to provide:
>>>>>>
>>>>>> Each process needs to get the same values for the following:
>>>>>>
>>>>>> * OMPI_MCA_ess=generic
>>>>>> * OMPI_MCA_orte_num_procs=<number of MPI procs>
>>>>>> * OMPI_MCA_orte_nodes=<a comma-separated list of nodenames where MPI 
>>>>>> procs reside>
>>>>>> * OMPI_MCA_orte_ppn=<number of procs/node>
>>>>>>
>>>>>> Note that I have assumed this last value is a constant for simplicity. 
>>>>>> If that isn't the case, let me know - you could instead provide it as a 
>>>>>> comma-separated list of values with an entry for each node.
>>>>>>
>>>>>> In addition, you need to provide the following value that will be unique 
>>>>>> to each process:
>>>>>>
>>>>>> * OMPI_MCA_orte_rank=<MPI rank>
>>>>>>
>>>>>> Finally, you have to provide a range of static TCP ports for use by the 
>>>>>> processes. Pick any range that you know will be available across all the 
>>>>>> nodes. You then need to ensure that each process sees the following 
>>>>>> envar:
>>>>>>
>>>>>> * OMPI_MCA_oob_tcp_static_ports=6000-6010  <== obviously, replace this 
>>>>>> with your range
>>>>>>
>>>>>> The port range must contain at least as many ports as the ppn for the
>>>>>> job (each proc on a node will take one of the provided ports).
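>>>>>>
>>>>>> As an illustration (hostnames and values are made up -- adjust for your
>>>>>> setup), the environment seen by the proc that will become MPI rank 3 of
>>>>>> a 4-proc job running 2 procs/node might look like:
>>>>>>
>>>>>> OMPI_MCA_ess=generic
>>>>>> OMPI_MCA_orte_num_procs=4
>>>>>> OMPI_MCA_orte_nodes=node01,node02
>>>>>> OMPI_MCA_orte_ppn=2
>>>>>> OMPI_MCA_orte_rank=3
>>>>>> OMPI_MCA_oob_tcp_static_ports=6000-6010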
>>>>>>
>>>>>> That should do it. I compute everything else I need from those values.
>>>>>>
>>>>>> Does that work for you?
>>>>>> Ralph
>>>>>>
>>>>>>
