Rob,

Thank you very much for the suggestion. There are two independent scenarios 
that use parallel I/O in my code:

1. MPI processes print conditionally, i.e., some processes print in the current 
loop iteration (but may not print in the next), others do not print in the 
current iteration (but may print in the next), and it does not matter which 
process prints first or last (NOT ordered). Clearly we cannot use a collective 
call for this scenario because the writes are conditional, and since I don't 
need the output ordered I chose MPI_File_write_shared (non-collective, shared 
file pointer, not ordered). It works well when Lustre is mounted with "flock", 
but fails without "flock".

In scenario 1 we cannot use an individual file pointer or an explicit offset, 
because the offset for each process cannot be predetermined. That is why I had 
to use a shared file pointer.
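
For concreteness, here is a minimal sketch of what scenario 1 looks like (the 
function name, the condition, and the output text are illustrative 
placeholders, not my actual code):

    #include <mpi.h>
    #include <sstream>
    #include <string>

    // Scenario 1 sketch: only ranks that have output in this loop iteration
    // write; order does not matter, so a non-collective shared-pointer write
    // is used.
    void writeIfNeeded(MPI_File contactFile, int rank, bool hasOutput)
    {
        if (!hasOutput)                      // this rank prints nothing this time
            return;

        std::ostringstream inf;
        inf << "rank " << rank << ": ...\n"; // rank-local results (placeholder)
        std::string s = inf.str();

        MPI_Status status;
        // non-collective, shared file pointer, unordered;
        // on Lustre this is the path that needs the "flock" mount option
        MPI_File_write_shared(contactFile, const_cast<char*>(s.c_str()),
                              static_cast<int>(s.size()), MPI_CHAR, &status);
    }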

2. Each MPI process unconditionally prints to a shared file (even if it prints 
nothing) and the order does not matter. Your suggestion works for this 
scenario; in fact it is even simpler because ordering is not required. We have 
two options:
(2A) Use a shared file pointer: either MPI_File_write_shared (non-collective) 
or MPI_File_write_ordered (collective) works, and no offsets need to be 
predetermined, but it requires "flock".
(2B) Use an individual file pointer, e.g., MPI_File_seek (or MPI_File_set_view) 
followed by MPI_File_write_all (collective). This requires calculating the 
offsets, which are pre-determinable, and it does not require "flock"; a sketch 
follows below.

In summary, scenario 2 can avoid the "flock" requirement by using option 2B, 
but scenario 1 cannot.

Thanks,
Beichuan

-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Rob Latham
Sent: Thursday, September 18, 2014 08:49
To: us...@open-mpi.org
Subject: Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4



On 09/17/2014 05:46 PM, Beichuan Yan wrote:
> Hi Rob,
>
> As you pointed out in April, there are many cases that can trigger the 
> ADIOI_Set_lock error. My code writes to a file at a location specified by a 
> shared file pointer (a blocking, collective call):
> MPI_File_write_ordered(contactFile, const_cast<char*>(inf.str().c_str()), 
>                        length, MPI_CHAR, &status);
>
> That is why disabling data sieving does not work for me, even when tested 
> with the latest openmpi-1.8.2 and gcc-4.9.1.
>
> May I ask a question: apart from mounting Lustre with the "flock" option, is 
> there any other workaround to avoid this ADIOI_Set_lock error in MPI-2 
> parallel I/O?
>

Shared file pointer operations don't get a lot of attention.

ROMIO is going to try to lock a hidden file that contains the 8-byte location 
of the shared file pointer.

Do you mix independent shared file pointer operations with ordered mode 
operations?  If not, read on for a better way to achieve ordering:

It's pretty easy to replace ordered-mode operations with a collective call that 
has the same behavior.  The key is to use MPI_Scan:

           /* incr: amount of data (in etype units of the file view) that this
            * rank will write */
           MPI_File_get_position(mpi_fh, &offset);  /* common starting offset */

           /* inclusive prefix sum of incr across ranks */
           MPI_Scan(&incr, &new_offset, 1, MPI_LONG_LONG_INT,
                           MPI_SUM, MPI_COMM_WORLD);
           new_offset -= incr;    /* exclusive sum: data from lower ranks only */
           new_offset += offset;  /* shift by the common starting offset */

           /* collective write at explicit offsets: no shared file pointer,
            * no hidden lock file */
           ret = MPI_File_write_at_all(mpi_fh, new_offset, buf, count,
                                   datatype, status);

See: every process has "incr" amount of data.  The MPI_Scan ensures the 
computed offsets are ascending in rank order (as they would be for ordered-mode 
I/O), and the actual I/O happens with a much faster MPI_File_write_at_all.

We wrote this up in our 2005 paper on shared memory for shared file pointers, 
even though this approach doesn't need RMA shared memory.

==rob

> Thanks,
> Beichuan
>
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Rob 
> Latham
> Sent: Monday, April 14, 2014 14:24
> To: Open MPI Users
> Subject: Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4
>
>
>
> On 04/08/2014 05:49 PM, Daniel Milroy wrote:
>> Hello,
>>
>> The file system in question is indeed Lustre, and mounting with flock 
>> isn't possible in our environment.  I recommended the following 
>> changes to the users' code:
>
> Hi.  I'm the ROMIO guy, though I do rely on the community to help me keep the 
> Lustre driver up to snuff.
>
>> MPI_Info_set(info, "collective_buffering", "true");
>> MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
>> MPI_Info_set(info, "romio_ds_read", "disable");
>> MPI_Info_set(info, "romio_ds_write", "disable");
>>
>> Which results in the same error as before.  Are there any other MPI 
>> options I can set?
>
> I'd like to hear more about the workload generating these lock messages, but 
> I can tell you the situations in which ADIOI_SetLock gets called:
> - everywhere on NFS.  If you have a Lustre file system exported to some 
> clients as NFS, you'll get NFS behavior on those clients (er, that might not 
> be true unless you pick up a recent patch)
> - when writing a non-contiguous region of the file, unless you disable data 
> sieving, as you did above
> - note: you don't need to disable data sieving for reads, though you might 
> want to if the data sieving algorithm is wasting a lot of data
> - if atomic mode was set on the file (i.e. you called MPI_File_set_atomicity)
> - if you use any of the shared file pointer operations
> - if you use any of the ordered mode collective operations
>
> You've turned off data-sieving writes, which is what I would have first 
> guessed was triggering this lock message, so I guess you are hitting one of 
> the other cases.
>
> ==rob
>
> --
> Rob Latham
> Mathematics and Computer Science Division, Argonne National Lab, IL USA
>

--
Rob Latham
Mathematics and Computer Science Division, Argonne National Lab, IL USA
