Hi Rob,

The applications of the two users in question are different; I haven't looked through much of either code. I can respond to your highlighted situations in sequence:
>- everywhere in NFS. If you have a Lustre file system exported to some
>clients as NFS, you'll get NFS (er, that might not be true unless you
>pick up a recent patch)

The compute nodes are Lustre clients mounting the file system via IB.

>- note: you don't need to disable data sieving for reads, though you
>might want to if the data sieving algorithm is wasting a lot of data.

That's good to know, though given the applications I can't say whether data sieving is wasting data.

>- if atomic mode was set on the file (i.e. you called
>MPI_File_set_atomicity)
>- if you use any of the shared file pointer operations
>- if you use any of the ordered mode collective operations

I don't know but will pass these questions on to the users.

Thank you,

Dan Milroy


On 4/14/14, 2:23 PM, "Rob Latham" <r...@mcs.anl.gov> wrote:

>On 04/08/2014 05:49 PM, Daniel Milroy wrote:
>> Hello,
>>
>> The file system in question is indeed Lustre, and mounting with flock
>> isn't possible in our environment. I recommended the following changes
>> to the users' code:
>
>Hi. I'm the ROMIO guy, though I do rely on the community to help me
>keep the lustre driver up to snuff.
>
>> MPI_Info_set(info, "collective_buffering", "true");
>> MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
>> MPI_Info_set(info, "romio_ds_read", "disable");
>> MPI_Info_set(info, "romio_ds_write", "disable");
>>
>> Which results in the same error as before. Are there any other MPI
>> options I can set?
>
>I'd like to hear more about the workload generating these lock messages,
>but I can tell you the situations in which ADIOI_SetLock gets called:
>- everywhere in NFS. If you have a Lustre file system exported to some
>clients as NFS, you'll get NFS (er, that might not be true unless you
>pick up a recent patch)
>- when writing a non-contiguous region in file, unless you disable data
>sieving, as you did above.
>- note: you don't need to disable data sieving for reads, though you
>might want to if the data sieving algorithm is wasting a lot of data.
>- if atomic mode was set on the file (i.e. you called
>MPI_File_set_atomicity)
>- if you use any of the shared file pointer operations
>- if you use any of the ordered mode collective operations
>
>you've turned off data sieving writes, which is what I would have first
>guessed would trigger this lock message. So I guess you are hitting one
>of the other cases.
>
>==rob
>
>--
>Rob Latham
>Mathematics and Computer Science Division
>Argonne National Lab, IL USA
>_______________________________________________
>users mailing list
>us...@open-mpi.org
>http://www.open-mpi.org/mailman/listinfo.cgi/users
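
[Editor's note] For readers landing on this thread later: below is a minimal sketch of how the ROMIO hints quoted above can be passed at open time, together with an explicit guard against the other lock-triggering cases Rob lists (atomic mode, shared file pointer operations, ordered mode collectives). The file name, open mode, and communicator are placeholders and are not taken from either user's application.

    /* Sketch only: hints from the thread plus an explicit atomicity check.
     * "output.dat" and the open mode are hypothetical placeholders. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_File fh;
        MPI_Info info;

        MPI_Init(&argc, &argv);

        MPI_Info_create(&info);
        /* Disable data sieving so non-contiguous writes do not take locks */
        MPI_Info_set(info, "romio_ds_read",  "disable");
        MPI_Info_set(info, "romio_ds_write", "disable");

        MPI_File_open(MPI_COMM_WORLD, "output.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

        /* Atomic mode also forces ADIOI_SetLock; it is off by default,
         * so this call is only a safeguard/check. */
        MPI_File_set_atomicity(fh, 0);

        /* ... perform I/O with explicit-offset or individual file pointer
         * routines (e.g. MPI_File_write_at_all) rather than shared file
         * pointer or ordered mode calls (MPI_File_write_shared,
         * MPI_File_write_ordered), which are the remaining lock cases ... */

        MPI_File_close(&fh);
        MPI_Info_free(&info);
        MPI_Finalize();
        return 0;
    }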