On 24 May 2012 12:40, George Bosilca <bosi...@eecs.utk.edu> wrote:
> On May 24, 2012, at 11:22 , Jeff Squyres wrote:
>
>> On May 24, 2012, at 11:10 AM, Lisandro Dalcin wrote:
>>
>>>> So I checked them all, and I found SCATTERV, GATHERV, and REDUCE_SCATTER 
>>>> all had the issue.  Now fixed on the trunk, and will be in 1.6.1.
>>>
>>> Please be careful with REDUCE_SCATTER[_BLOCK]. My understanding of
>>> the MPI standard is that the length of the recvcounts array is the
>>> local group size
>>> (http://www.mpi-forum.org/docs/mpi22-report/node113.htm#Node113)
>>
>>
>> I read that this morning and it made my head hurt.
>>
>> I read it to be: reduce the data in the local group, scatter the results to 
>> the remote group.
>>
>> As such, the reduce COUNT is sum(recvcounts), and is used for the reduction 
>> in the local group.  Then use recvcounts to scatter it to the remote group.
>>
>> …right?
>
> Right, you reduce locally but you scatter remotely. As such, the size of the 
> recvcounts buffer is the remote size. Since in the local group you do a reduce 
> (where every process participates with the same amount of data), you only need 
> a total count, which in this case is the sum of all recvcounts. This 
> requirement is enforced by the fact that the input buffer is of size sum of 
> all recvcounts, which makes sense only if you know the remote group's receive 
> counts.

The standard says this:

"Within each group, all processes provide the same recvcounts
argument, and provide input vectors of  sum_i^n recvcounts[i] elements
stored in the send buffers, where n is the size of the group"

So, I read " Within each group, ... where n is the size of the group"
as being the LOCAL group size.
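
To make the disagreement concrete, here is a minimal sketch in plain Python
(no MPI calls) of the two readings discussed in this thread. The helper names
recvcounts_is_valid and send_count and the "local"/"remote" labels are
hypothetical, introduced only for illustration; the sketch does not claim
which reading the standard intends.

```python
def recvcounts_is_valid(recvcounts, local_size, remote_size, reading):
    """Check the length of recvcounts under one of the two readings.

    reading == "local":  one entry per process in the local group.
    reading == "remote": one entry per process in the remote group.
    """
    if reading == "local":
        expected = local_size
    elif reading == "remote":
        expected = remote_size
    else:
        raise ValueError("reading must be 'local' or 'remote'")
    return len(recvcounts) == expected


def send_count(recvcounts):
    """Number of elements each process provides in its send buffer."""
    return sum(recvcounts)


# A local group of 2 processes facing a remote group of 3 processes.
counts = [1, 2]
print(recvcounts_is_valid(counts, 2, 3, "local"))
print(recvcounts_is_valid(counts, 2, 3, "remote"))
print(send_count(counts))
```

Both readings agree that the send buffer holds sum(recvcounts) elements; they
differ only in how many entries recvcounts must have when the two groups have
different sizes.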

>
> I don't see much difference with the other collectives. The generic behavior 
> is that you apply the operation on the local group but the result is moved 
> into the remote group.
>

Well, for me this one IS different (for example, SCATTER is
unidirectional for intercommunicators, but REDUCE_SCATTER is
bidirectional). The "recvbuf" is a local buffer, but you understand
"recvcounts" as remote.
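
The bidirectional behavior can be sketched as a small pure-Python simulation
(no MPI involved). It assumes the local-group reading of recvcounts and a sum
reduction; the function name reduce_scatter_inter and its argument layout are
hypothetical, chosen only for this illustration.

```python
from itertools import accumulate


def reduce_scatter_inter(send_a, counts_a, send_b, counts_b):
    """Simulate a sum REDUCE_SCATTER between two groups A and B.

    send_x:   one input vector per process of group X.
    counts_x: recvcounts passed by group X, assumed here to have one
              entry per process of X (the local-group reading).
    Returns (recv_a, recv_b): the per-process receive buffers.
    """
    def reduce(vectors):
        # Element-wise sum over all processes of one group.
        return [sum(column) for column in zip(*vectors)]

    def scatter(vector, counts):
        # Split the vector into consecutive blocks of the given sizes.
        offsets = [0, *accumulate(counts)]
        return [vector[lo:hi] for lo, hi in zip(offsets, offsets[1:])]

    for vectors, counts in ((send_a, counts_a), (send_b, counts_b)):
        if len(counts) != len(vectors):
            raise ValueError("recvcounts length must equal local group size")
        if any(len(v) != sum(counts) for v in vectors):
            raise ValueError("each send vector needs sum(recvcounts) elements")
    if sum(counts_a) != sum(counts_b):
        raise ValueError("total count must match across the two groups")

    # Data reduced in one group lands in the receive buffers of the other.
    recv_a = scatter(reduce(send_b), counts_a)
    recv_b = scatter(reduce(send_a), counts_b)
    return recv_a, recv_b


recv_a, recv_b = reduce_scatter_inter(
    [[1, 2, 3], [10, 20, 30]], [1, 2],           # group A: 2 processes
    [[1, 1, 1], [2, 2, 2], [3, 3, 3]], [1, 1, 1]  # group B: 3 processes
)
print(recv_a)  # [[6], [6, 6]]
print(recv_b)  # [[11], [22], [33]]
```

In this sketch each group's recvcounts describes its own receive buffers, so
data flows in both directions at once, unlike SCATTER, where only one group
supplies data.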

Mmm, the standard is really confusing on this point...

-- 
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169
