On Mon, 31 Dec 2007, Darren Reed wrote:
> Frank Hofmann wrote:
>>
>>
>> On Fri, 28 Dec 2007, Darren Reed wrote:
>> [ ... ]
>>> Is this behaviour defined by a standard (such as POSIX or the
>>> VFS design) or are we free to innovate here and do something
>>> that allowed such a shortcut as required?
>>
>> Wrt. standards, quote from:
>>
>> http://www.opengroup.org/onlinepubs/009695399/functions/rename.html
>>
>> ERRORS
>> The rename() function shall fail if:
>> [ ... ]
>> [EXDEV]
>> [CX] The links named by new and old are on different file systems
>> and the implementation does not support links between file systems.
>>
>> Hence, it's implementation-dependent, as per IEEE1003.1.
>
> This implies that we'd also have to look at allowing
> link(2) to also function between filesystems where
> rename(2) was going to work without doing a copy,
> correct? Which I suppose makes sense.

Copy-on-write. rename() is just defined as an "atomic" sequence of:

    link(old, new);
    unlink(old);

If cross-fs rename is possible, then cross-fs link is as well. It's a
"per-file clone".
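
To make the link()/unlink() sequence concrete, here is a minimal sketch in C.
The helper name rename_by_link is made up for illustration. It is not atomic
the way a real rename() is, and it simply passes through whatever errno link()
reports, which would be EXDEV wherever cross-fs links aren't supported.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: non-atomic emulation of rename(old, new_path) as
 * link() followed by unlink(). Unlike rename(), it fails with EEXIST if
 * new_path already exists, and other processes can see both names
 * between the two calls.
 * Returns 0 on success, -1 with errno set on failure (e.g. EXDEV when
 * the paths are on different file systems and cross-fs links are not
 * supported).
 */
int rename_by_link(const char *old, const char *new_path)
{
    if (link(old, new_path) != 0)
        return -1;              /* errno set by link() */

    if (unlink(old) != 0) {
        int saved = errno;
        unlink(new_path);       /* best-effort rollback of the new name */
        errno = saved;
        return -1;
    }
    return 0;
}
```

Calling rename_by_link("a", "b") on a single filesystem should leave only "b"
behind, with the same st_ino as "a" had, since link() only adds a second name
for the same file.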

Btw, Joerg, this addresses the concern you had in any case. It's cross-fs,
which means st_dev/st_ino _WILL_ change. Persistence of open files is not
related to that. If you hold a file open, the st_dev/st_ino associated
with the open fd will stay around and continue to be accessible with
fstat() - but not necessarily with stat(), and definitely not once the
file has been removed. As far as I can see, having a cross-fs rename
remove the file on the source fs doesn't violate anything.

The location of the file's data is _NOT_ the only way to derive a unique
st_dev/st_ino pair.
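
The fstat()-vs-stat() point can be checked with a few lines of C. This is
only a sketch; the function name identity_survives_unlink is made up, and it
uses a plain unlink() to stand in for the source-side removal a cross-fs
rename would do.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical check: open `path`, remove its name, then look up the
 * file's identity two ways.
 * Returns 1 if fstat() on the open fd still reports the original
 * st_dev/st_ino while stat() on the path fails with ENOENT, 0 if not,
 * and -1 if open(), the first fstat() or unlink() fails.
 */
int identity_survives_unlink(const char *path)
{
    struct stat before, by_fd, by_name;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &before) != 0 || unlink(path) != 0) {
        close(fd);
        return -1;
    }

    /* The open fd keeps the file, and its identity, alive. */
    int fd_ok = fstat(fd, &by_fd) == 0 &&
                by_fd.st_dev == before.st_dev &&
                by_fd.st_ino == before.st_ino;

    /* The name is gone, so a path-based lookup must fail. */
    int name_gone = stat(path, &by_name) != 0 && errno == ENOENT;

    close(fd);
    return fd_ok && name_gone;
}
```

Note that the function removes the file it is given, so point it at a
scratch file only.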

rename() _within_ a filesystem (as defined by the set of nodes with a
common st_dev) should preserve st_ino if the fs supports link counts
larger than one, agreed. But let's not confuse this with cross-fs rename,
where by definition (cross-fs) st_dev must change. The identity of that
file, therefore, has changed.
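
For the within-one-filesystem case, a sketch like the following compares the
st_dev/st_ino pair before and after the rename. The name
rename_keeps_identity is made up for illustration, and whether st_ino really
is preserved depends on the filesystem, as said above.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical check: rename `old` to `new_path` and compare the file's
 * identity before and after.
 * Returns 1 if both st_dev and st_ino are unchanged, 0 if either
 * differs, and -1 if stat() or rename() fails (for example with EXDEV
 * when the paths are on different file systems).
 */
int rename_keeps_identity(const char *old, const char *new_path)
{
    struct stat before, after;

    if (stat(old, &before) != 0)
        return -1;
    if (rename(old, new_path) != 0)
        return -1;
    if (stat(new_path, &after) != 0)
        return -1;

    return before.st_dev == after.st_dev &&
           before.st_ino == after.st_ino;
}
```

With a rename that really crossed filesystems, st_dev would differ and the
function would return 0; where cross-fs rename isn't supported at all, it
returns -1 with errno set to EXDEV.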

With ZFS we're just in the happy situation that the low-level storage
implementation can know that the contents haven't.

That's a sad situation for backup utilities, by the way - a backup tool
would have no way of finding out that file X on fs A already existed as
file Z on fs B. So what? If the file got copied, byte by byte, the same
situation exists: the contents are identical. Backups being slower than
they could be if the backup utility were omniscient is, I think, no
reason to slow file copy/rename operations down.

Happy new year!
FrankH.