Eryk Sun <eryk...@gmail.com> added the comment:

I'm trying to give os.link() and follow_symlinks the benefit of the doubt, but 
the implementation just seems buggy to me. 

POSIX says that "[i]f path1 names a symbolic link, it is implementation-defined 
whether link() follows the symbolic link, or creates a new link to the symbolic 
link itself" [1]. In Linux, link() does not follow symlinks. One has to call 
linkat() with AT_SYMLINK_FOLLOW:

    AT_SYMLINK_FOLLOW (since Linux 2.6.18)
        By default, linkat(), does not dereference oldpath if it is a 
        symbolic link (like link()). The flag AT_SYMLINK_FOLLOW can be
        specified in flags to cause oldpath to be dereferenced if it is
        a symbolic link. 

The behavior is apparently the same in FreeBSD [2]. 

Thus the following implementation in os.link() seems buggy.

#ifdef HAVE_LINKAT
    if ((src_dir_fd != DEFAULT_DIR_FD) ||
        (dst_dir_fd != DEFAULT_DIR_FD) ||
        (!follow_symlinks))
        result = linkat(src_dir_fd, src->narrow,
            dst_dir_fd, dst->narrow,
            follow_symlinks ? AT_SYMLINK_FOLLOW : 0);
    else
#endif /* HAVE_LINKAT */

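In Linux, the value of follow_symlinks only matters if src_dir_fd or 
dst_dir_fd is used with a real file descriptor (i.e. not DEFAULT_DIR_FD, which 
is AT_FDCWD). Otherwise, follow_symlinks=True falls through to plain link() 
and follow_symlinks=False calls linkat() without AT_SYMLINK_FOLLOW, and 
neither follows symlinks. So the default True value of follow_symlinks is an 
outright lie. For example: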

    >>> os.link in os.supports_follow_symlinks
    True
    >>> open('spam', 'w').close()
    >>> os.symlink('spam', 'spamlink1')
    >>> os.link('spamlink1', 'spamlink2')

spamlink2 was created as a hardlink to spamlink1, not its target, i.e. it's a 
symlink:
 
    >>> os.lstat('spamlink1').st_ino == os.lstat('spamlink2').st_ino
    True
    >>> os.readlink('spamlink2')
    'spam'

In contrast, if src_dir_fd is passed, then follow_symlinks=True is implemented 
as advertised (via AT_SYMLINK_FOLLOW):

    >>> fd = os.open('.', 0)
    >>> os.link('spamlink1', 'spamlink3', src_dir_fd=fd)

spamlink3 was created as a hardlink to spam, the target of spamlink1:
  
    >>> os.lstat('spam').st_ino == os.lstat('spamlink3').st_ino
    True

That the value of an unrelated parameter -- src_dir_fd -- changes the behavior 
of the follow_symlinks parameter is obviously a bug that should be addressed.
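
For anyone who wants to check this on their own system, here is a minimal 
sketch of the same experiment as a script. It runs in a temporary directory. 
The helper names (kind, probe) and the link names are made up for 
illustration, and the result of the default case may differ by platform and 
Python version.

```python
import os
import tempfile


def kind(path):
    """Report whether a hard link is the symlink itself or its target."""
    return "symlink" if os.path.islink(path) else "target"


def probe():
    """Hard-link a symlink three ways and record what each link ended up as."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        spam = os.path.join(d, "spam")
        link1 = os.path.join(d, "spamlink1")
        open(spam, "w").close()
        os.symlink("spam", link1)

        # Default call: follow_symlinks is documented as True.
        a = os.path.join(d, "a")
        os.link(link1, a)
        results["default"] = kind(a)

        # Explicit follow_symlinks=False.
        b = os.path.join(d, "b")
        os.link(link1, b, follow_symlinks=False)
        results["follow_symlinks=False"] = kind(b)

        # Same default, but with src_dir_fd passed.
        fd = os.open(d, os.O_RDONLY)
        try:
            c = os.path.join(d, "c")
            os.link("spamlink1", c, src_dir_fd=fd)
            results["src_dir_fd"] = kind(c)
        finally:
            os.close(fd)
    return results


if __name__ == "__main__":
    for case, result in probe().items():
        print(f"{case}: hard link to the {result}")
```

If "default" and "src_dir_fd" report different results, the system shows the 
inconsistency described above.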

POSIX mandates that "[i]f both fd1 and fd2 have value AT_FDCWD, the behavior 
shall be identical to a call to link(), except that symbolic links shall be 
handled as specified by the value of flag". os.link() already uses AT_FDCWD as 
the default value of src_dir_fd and dst_dir_fd, so its implementation should 
just unconditionally call linkat() if it's available. Then the value of 
follow_symlinks, true or false, will be honored, with or without passing 
src_dir_fd or dst_dir_fd.
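
The POSIX wording can be checked without going through os.link() at all. The 
following is only a sketch, assuming Linux: it calls linkat() through ctypes 
with AT_FDCWD for both directory arguments. The numeric constants are the 
Linux values from <fcntl.h> (other platforms use different numbers), and 
raw_linkat and demo are hypothetical names, not part of any library.

```python
import ctypes
import os
import sys
import tempfile

# Linux-specific values; do not reuse on other platforms.
AT_FDCWD = -100
AT_SYMLINK_FOLLOW = 0x400


def raw_linkat(src, dst, follow_symlinks):
    """Call linkat(AT_FDCWD, src, AT_FDCWD, dst, flags) directly."""
    if not sys.platform.startswith("linux"):
        raise NotImplementedError("the constants above are Linux-specific")
    libc = ctypes.CDLL(None, use_errno=True)
    flags = AT_SYMLINK_FOLLOW if follow_symlinks else 0
    rc = libc.linkat(AT_FDCWD, os.fsencode(src),
                     AT_FDCWD, os.fsencode(dst), flags)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), src, None, dst)


def demo():
    """Return (followed_is_symlink, unfollowed_is_symlink)."""
    with tempfile.TemporaryDirectory() as d:
        link1 = os.path.join(d, "spamlink1")
        open(os.path.join(d, "spam"), "w").close()
        os.symlink("spam", link1)

        followed = os.path.join(d, "followed")
        unfollowed = os.path.join(d, "unfollowed")
        raw_linkat(link1, followed, follow_symlinks=True)
        raw_linkat(link1, unfollowed, follow_symlinks=False)
        return os.path.islink(followed), os.path.islink(unfollowed)


if __name__ == "__main__":
    print(demo())
```

With no real directory descriptor involved, the flag alone decides whether 
the new link refers to the symlink or to its target.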

That said, since os.link() hasn't been working as advertised, this change needs 
to be accompanied by changing the default value of follow_symlinks to False. 
That will retain the status quo behavior for most systems, except in the rare 
case that src_dir_fd or dst_dir_fd is used. If it isn't changed to False, then 
suddenly os.link() calls will start following symlinks, whereas prior to the 
change they did not because link() was being called instead of linkat(). 
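
In the meantime, callers who don't want to depend on the default can make the 
choice explicit. This is a caller-side sketch, not a proposed patch: 
link_explicit is a hypothetical helper, it assumes a platform where os.link() 
accepts follow_symlinks=False (i.e. one with linkat()), and resolving the path 
with os.path.realpath() is not atomic with respect to the link call.

```python
import os


def link_explicit(src, dst, *, follow_symlinks):
    """Hard-link src to dst without relying on os.link()'s default.

    follow_symlinks=True links to the fully resolved target of src;
    follow_symlinks=False links to src itself, even if it is a symlink.
    """
    if follow_symlinks:
        # Resolve symlinks ourselves, then link the resolved path as-is.
        src = os.path.realpath(src)
    os.link(src, dst, follow_symlinks=False)
```

For example, link_explicit('spamlink1', 'spamlink2', follow_symlinks=False) 
asks for a hard link to the symlink itself.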

--- 

In Windows, CreateHardLinkW [3] is incorrectly documented as following symlinks 
(i.e. "[i]f the path points to a symbolic link, the function creates a hard 
link to the target"). Actually, it opens the file to be hard-linked with the 
NTAPI option FILE_OPEN_REPARSE_POINT (same as WinAPI 
FILE_FLAG_OPEN_REPARSE_POINT). Thus no type of reparse point is followed, 
including symlinks.

---

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html
[2]: https://www.unix.com/man-page/FreeBSD/2/link
[3]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createhardlinkw

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41355>
_______________________________________