On 09/17/2014 05:46 PM, Beichuan Yan wrote:
Hi Rob,

As you pointed out in April, there are many cases that can trigger the ADIOI_Set_lock 
error. My code writes to a file at a location specified by a shared file pointer, using a 
blocking, collective call:

    MPI_File_write_ordered(contactFile, const_cast<char*>(inf.str().c_str()),
                           length, MPI_CHAR, &status);

That is why disabling data sieving does not work for me, even when I test with the 
latest openmpi-1.8.2 and gcc-4.9.1.

Can I ask a question? Other than mounting Lustre with the "flock" option, is there 
any other workaround to avoid this ADIOI_Set_lock error in MPI-2 parallel I/O?


Shared file pointer operations don't get a lot of attention.

ROMIO is going to try to lock a hidden file that contains the 8-byte location of the shared file pointer.

Do you mix independent shared file pointer operations with ordered mode operations? If not, read on for a better way to achieve ordering:

It's pretty easy to replace ordered mode operations with collective calls that give the same behavior. The key is to use MPI_Scan:

          /* Start from the current position of this rank's file pointer. */
          MPI_File_get_position(mpi_fh, &offset);

          /* Inclusive prefix sum of each rank's byte count ("incr"). */
          MPI_Scan(&incr, &new_offset, 1, MPI_LONG_LONG_INT,
                   MPI_SUM, MPI_COMM_WORLD);
          new_offset -= incr;    /* exclusive: only bytes from lower ranks */
          new_offset += offset;  /* shift by the starting position */

          ret = MPI_File_write_at_all(mpi_fh, new_offset, buf, count,
                                      datatype, status);

See: every process has "incr" bytes of data. The MPI_Scan ensures the computed offsets ascend in rank order (as they would for ordered-mode I/O), and the actual I/O happens through the much faster MPI_File_write_at_all.
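
For completeness, here is a minimal, self-contained sketch of that pattern, assuming each rank writes its own small text record; the file name ("contact.dat"), buffer contents, and variable names are illustrative, not taken from your code:

    /* Sketch: replacing MPI_File_write_ordered with MPI_Scan +
     * MPI_File_write_at_all.  Names and file path are illustrative. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "contact.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Per-rank payload; the length can differ from rank to rank. */
        char buf[64];
        long long incr = snprintf(buf, sizeof(buf), "rank %d reporting\n", rank);

        /* Inclusive prefix sum of the byte counts across ranks. */
        long long new_offset = 0;
        MPI_Scan(&incr, &new_offset, 1, MPI_LONG_LONG_INT,
                 MPI_SUM, MPI_COMM_WORLD);
        new_offset -= incr;          /* now: bytes written by lower ranks only */

        /* Add the current position of the individual file pointer
         * (zero here, since the file was just opened). */
        MPI_Offset base;
        MPI_File_get_position(fh, &base);

        MPI_Status status;
        MPI_File_write_at_all(fh, base + new_offset, buf, (int)incr,
                              MPI_CHAR, &status);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }

Built with mpicc and run under mpirun, this produces the records concatenated in rank order, just as MPI_File_write_ordered would, but without ever touching the hidden shared-file-pointer file.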

We wrote this up in our 2005 paper on using shared memory for shared file pointers, even though this approach doesn't need RMA shared memory.

==rob

Thanks,
Beichuan

-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Rob Latham
Sent: Monday, April 14, 2014 14:24
To: Open MPI Users
Subject: Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4



On 04/08/2014 05:49 PM, Daniel Milroy wrote:
Hello,

The file system in question is indeed Lustre, and mounting with flock
isn't possible in our environment.  I recommended the following
changes to the users' code:

Hi.  I'm the ROMIO guy, though I do rely on the community to help me keep the 
Lustre driver up to snuff.

MPI_Info_set(info, "collective_buffering", "true");
MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
MPI_Info_set(info, "romio_ds_read", "disable");
MPI_Info_set(info, "romio_ds_write", "disable");

Which results in the same error as before.  Are there any other MPI
options I can set?

I'd like to hear more about the workload generating these lock messages, but I 
can tell you the situations in which ADIOI_SetLock gets called:
- everywhere on NFS.  If you have a Lustre file system exported to some clients 
as NFS, those clients get the NFS locking behavior (er, that might not be true 
unless you pick up a recent patch)
- when writing a non-contiguous region in the file, unless you disable data 
sieving, as you did above.
- note: you don't need to disable data sieving for reads, though you might want 
to if the data sieving algorithm ends up reading a lot of data you don't need.
- if atomic mode was set on the file (i.e. you called MPI_File_set_atomicity)
- if you use any of the shared file pointer operations
- if you use any of the ordered mode collective operations

You've turned off data sieving for writes, which is what I would have first guessed 
was triggering this lock message, so I guess you are hitting one of the other cases.
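
(As a side note, here is a minimal sketch of how such hints are attached to a file at open time through an MPI_Info object; the helper name and file path are illustrative, and the hint names are the ones quoted above:)

    #include <mpi.h>

    /* Open a file with ROMIO data sieving disabled via hints.
     * ROMIO picks the hints up from the info object passed to
     * MPI_File_open; helper name and path are illustrative. */
    static MPI_File open_with_hints(MPI_Comm comm, const char *path)
    {
        MPI_Info info;
        MPI_Info_create(&info);
        MPI_Info_set(info, "romio_ds_read", "disable");
        MPI_Info_set(info, "romio_ds_write", "disable");
        MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");

        MPI_File fh;
        MPI_File_open(comm, path, MPI_MODE_CREATE | MPI_MODE_RDWR, info, &fh);

        MPI_Info_free(&info);   /* the file handle keeps its own copy */
        return fh;
    }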

==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
