Hi Luke,

I am afraid the corruption cannot be avoided, especially with older NFS versions where even locks fail... I also have problems with network interruptions...
Dana, I have the impression that SWMR will not be supported on NFS. I hope some of the upcoming APIs in 1.10.0 will allow for safer metadata management that will rationalise these risks.

Best

-- dimitris

2013/8/27, Dana Robinson <[email protected]>:
> Hi all,
>
> My answers are interspersed below. This is the first I've seen of this
> thread, so I apologize if I'm missing any context here. Other HDF Group
> personnel can weigh in if there's anything I'm missing.
>
> Dana
>
> From: Hdf-forum [mailto:[email protected]] On Behalf Of
> Luke Campbell
> Sent: Tuesday, August 27, 2013 1:18 PM
> To: Dimitris Servis
> Cc: HDF Users Discussion List
> Subject: Re: [Hdf-forum] HDF5 Compatibility with NFS and Fault Tolerance
>
> We introduced locking in a recent version to assist in dealing with the
> concurrent access issues but the larger concern I have is with file
> consistency with a network interruption or failure.
>
> [Dana Robinson] At this time, the HDF5 library does not support concurrent
> access when a writer is involved. In the future HDF5 1.10.0 release
> (release date: TBA), we plan to include a feature that will allow concurrent
> access by a single writer and any number of readers (the Single
> Writer/Multiple Reader - SWMR pattern). This feature is under active
> development but is not ready for production use. Let me know if you'd like
> to know more about the feature and its status.
>
> I was debugging HDF5 at a low level and worked only on one use case: change
> an existing variable length string attribute in a dataset of an HDF5 file. I
> added some print statements around the write system calls and identified
> that there are at a minimum four write calls that occur for this use case.
> One is at the superblock, which updates the file's end-of-file address for
> the new global heap structure at the bottom of the file, just a 4k offset
> from the previous value. The next one is ambiguous to me since the values
> were identical to the previous file's contents, but I think it was either in
> the symbol table or the B-tree; I can't recall off the top of my head. Then
> 4096 bytes are written to the end of the file, containing the new global
> heap. And lastly, 120 bytes are written to the local heap, where an
> attribute is either created or updated to point to the new global heap at
> the end of the file.
>
> The issue I see with NFS as the underlying (almost POSIX compliant)
> filesystem is that no matter what mutable HDF operation you wish to perform,
> a network interruption or application crash can result in file corruption.
>
> The NFS settings we're using are:
>
> resvport,rw,noexec,sync,wsize=32768,rsize=32768,nfsvers=3,soft,nolocallocks,intr
>
> I don't see a solution unless we can set up an all-or-nothing style
> transaction within HDF5, and as the HDF Group has already posted that HDF5 is
> not transactional, I don't know how to proceed. Even if you were to somehow
> tell NFS to write the entire file to disk (or in this case to NFS), POSIX
> only guarantees atomicity for 512 bytes. I thought for a while that if a
> write failed the client would just try to rewrite the data and that the
> client would get a response for how many bytes were written successfully.
> Unfortunately, with NFS this is not necessarily the case, and from
> experimentation I have determined that it depends heavily on the client
> implementation and options used.
>
> [Dana Robinson] The SWMR feature will introduce write ordering that will
> always leave the file in a consistent, but not necessarily up-to-date state.
> This will not require transactions.
> As for atomicity, the library will use
> checksums to retry when torn (non-atomic) writes are encountered under SWMR
> read conditions. This will fix the lack of write-call-level atomicity in a
> file system. Write ordering at the file system level will still be
> required, though. We have also been discussing a transaction feature for
> HDF5, but this is at the preliminary design stage and is currently unfunded.
> Contact us if you are interested in supporting it :)
>
> Any suggestions or ideas on the combinations of options for both NFS and HDF
> that would eliminate the possibility of file corruption would be greatly
> appreciated!
>
>
> Hi Luke
>
> Do you lock the files? What nfs version?
>
> Regards
>
> Dimitris
>
> Luke Campbell
>
>
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
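
To make the checksum-and-retry idea Dana mentions a bit more concrete, here is a minimal sketch in Python. It is not how the HDF5 library implements it: the helper names (`pack_block`, `unpack_block`, `read_with_retry`) are hypothetical, and the choice of CRC32 is an assumption, since the thread does not say which checksum is used. It only illustrates the general pattern: the writer stores a checksum with each block, and a reader that sees a mismatch assumes it caught a torn write and reads again.

```python
import struct
import zlib


def pack_block(payload: bytes) -> bytes:
    """Append a CRC32 of the payload so a reader can detect a torn write."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def unpack_block(block: bytes):
    """Return the payload if its stored checksum matches, else None."""
    if len(block) < 4:
        return None
    payload = block[:-4]
    (stored,) = struct.unpack("<I", block[-4:])
    return payload if zlib.crc32(payload) == stored else None


def read_with_retry(read_block, max_attempts=5):
    """Call read_block() until it returns a block with a valid checksum."""
    for _ in range(max_attempts):
        payload = unpack_block(read_block())
        if payload is not None:
            return payload
    raise OSError("block still inconsistent after %d attempts" % max_attempts)


# Simulate a reader racing a writer that replaces an old block with a new one.
old = pack_block(b"A" * 16)
new = pack_block(b"B" * 16)
torn = new[:8] + old[8:]  # half new, half old: what a torn write can look like

reads = iter([torn, new])  # first read is torn, second sees the finished write
result = read_with_retry(lambda: next(reads))
```

Note that this only covers the reader side. As Dana points out, it still relies on the file system preserving write ordering, and it does nothing for a writer that crashes partway through a multi-write update.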
