On 06/11/2013 10:36 PM, Jared Still wrote:
> Investigating an error in backup script I find that the use of dd parameter
> iflag=direct is giving incorrect results in some circumstances.
>
> ---------------------------------------------------------------------------------------------
> coreutils:
> oracle@dbs-1> dd --version
> dd (coreutils) 6.3
> Copyright (C) 2006 Free Software Foundation, Inc.
> This is free software. You may redistribute copies of it under the terms of
> the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
> There is NO WARRANTY, to the extent permitted by law.
>
> Written by Paul Rubin, David MacKenzie, and Stuart Kemp.
> ---------------------------------------------------------------------------------------------
>
> I realize this is an old version, but there is little I can do about that.
>
> The situation:
> A file is created and the md5sum returned:
>
> origsum=`dd bs=1048576 if=source_file 2>/dev/null| tee destfile | md5sum `
> destsum=`dd if=destfile bs=1048576 iflag=direct 2>/dev/null| md5sum
>
> Source and Destination are both NFS V3, separate mount points.
>
> When used with iflag=direct on the destination file, the md5sums do not
> match.
> When iflag=direct is removed, the md5sum is correct.
>
> This may or may not be an issue with dd.
>
> Have there been any bugs reported for something similar to this?
> I have not ruled out other issues, nfs, etc, but this so far has only been
> seen with dd.
>
> This was also duplicated without md5sum in the command line.
> That is, local files were created with and without iflag=direct.
>
> The md5sums do not match these files, with the same sums as seen when piped
> to md5sum.
>
> A hex dump of each file was created, and a diff taken.
>
> Here is a sample of the diffs if it is of any value:
>
> < 217134000 9d00 0000 3205 0400 36c2 0055 c305 4c3b
>> 217134000 9d00 0000 c305 543b 0032 c204 5536 0500
> -- -------------------
> < 217134020 001d c204 5536 0800 7078 0c07 0111 0001
>> 217134020 3bc3 1d4c 0400 36c2 0055 7808 0770 110c
> -- -------------------
> < 217134040 3402 0600 8110 32c3 6e00 0000 c305 543b
>> 217134040 0101 0200 0034 1006 c381 0032 006e 0500
> -- -------------------
> < 217134060 0032 c204 5536 0500 3bc3 1d4c 0400 36c2
>> 217134060 3bc3 3254 0400 36c2 0055 c305 4c3b 001d
>
>
> Thanks for reading this far,
>

I doubt dd is at issue here.
There was an old kernel issue with O_DIRECT returning invalid data
from sparse files, though that doesn't seem to be the case above.

This is only an issue over NFS.
NFS doesn't pass O_DIRECT to the server,
but perhaps it's triggering some bug in the client?

Is the corruption always 37,533,700 bytes in?

thanks,
Pádraig.
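
For reference, the 37,533,700 figure can be recovered from the quoted dump. The sketch below assumes the offsets in the dump are octal, as od prints them by default, and that the first differing 16-bit word in the first pair of lines starts 4 bytes into the line (the first two words, 9d00 0000, match on both sides). The variable names are only illustrative.

```shell
# Offset shown on the first differing line of the quoted dump.
# Assumption: it is octal, as od prints offsets by default.
dump_offset=217134000

# A leading 0 marks an octal constant in POSIX shell arithmetic.
line_start=$((0$dump_offset))

# The first two 16-bit words (4 bytes) are identical in both dumps,
# so the first differing word starts 4 bytes into the line.
first_diff=$((line_start + 4))

echo "line starts at byte $line_start"
echo "first differing word starts at byte $first_diff"
```

Running cmp on the two local copies after each test would show whether the first difference keeps landing at the same place. Note that cmp counts bytes from 1, so its number would be one higher than a zero-based offset.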
