[Python-Dev] pathlib handling of trailing slash (Issue #21039)

2014-08-06 Thread Isaac Schwabacher
pathlib.Path currently strips trailing slashes from pathnames, but this 
behavior contradicts POSIX 
(http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_12),
 which specifies that the resolution of the pathname of a symbolic link to a 
directory in the context of a function that operates on symbolic links shall 
depend on whether the pathname has a trailing slash:

> 4.12 Pathname Resolution
> 
>
> [...]
>
> A pathname that contains at least one non-  character and that ends 
> with one or more trailing  characters shall not be resolved 
> successfully unless the last pathname component before the trailing  
> characters names an existing directory or a directory entry that is to be 
> created for a directory immediately after the pathname is resolved. 
> Interfaces using pathname resolution may specify additional constraints[1] 
> when a pathname that does not name an existing directory contains at least 
> one non-  character and contains one or more trailing  
> characters.
>
> If a symbolic link is encountered during pathname resolution, the behavior 
> shall depend on whether the pathname component is at the end of the pathname 
> and on the function being performed. If all of the following are true, then 
> pathname resolution is complete:
>
> 1. This is the last pathname component of the pathname.
> 2. The pathname has no trailing .
> 3. The function is required to act on the symbolic link itself, or certain 
> arguments direct that the function act on the symbolic link itself.
>
> In all other cases, the system shall prefix the remaining pathname, if any, 
> with the contents of the symbolic link. [...]

The following sentence appeared in an earlier version of POSIX 
(http://pubs.opengroup.org/onlinepubs/009604499/basedefs/xbd_chap04.html#tag_04_11)
 but has since been removed:

> A pathname that contains at least one non-slash character and that ends with 
> one or more trailing slashes shall be resolved as if a single dot character ( 
> '.' ) were appended to the pathname.

Is this important enough to preserve trailing slashes?

- Isaac Schwabacher
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib handling of trailing slash (Issue #21039)

2014-08-06 Thread Antoine Pitrou


Le 06/08/2014 18:36, Isaac Schwabacher a écrit :


If a symbolic link is encountered during pathname resolution, the
behavior shall depend on whether the pathname component is at the
end of the pathname and on the function being performed. If all of
the following are true, then pathname resolution is complete:

1. This is the last pathname component of the pathname. 2. The
pathname has no trailing . 3. The function is required to
act on the symbolic link itself, or certain arguments direct that
the function act on the symbolic link itself.

In all other cases, the system shall prefix the remaining pathname,
if any, with the contents of the symbolic link. [...]


So the only case where this would make a difference is when calling a 
"function acting on the symbolic link itself" (such as lstat() or 
unlink()) on a path with a trailing slash:


>>> os.lstat('foo')
os.stat_result(st_mode=41471, st_ino=1981954, st_dev=2050, st_nlink=1, 
st_uid=1000, st_gid=1000, st_size=4, st_atime=1407370025, 
st_mtime=1407370025, st_ctime=1407370025)

>>> os.lstat('foo/')
os.stat_result(st_mode=17407, st_ino=917505, st_dev=2050, st_nlink=7, 
st_uid=0, st_gid=0, st_size=4096, st_atime=1407367916, 
st_mtime=1407369857, st_ctime=1407369857)


>>> pathlib.Path('foo').lstat()
os.stat_result(st_mode=41471, st_ino=1981954, st_dev=2050, st_nlink=1, 
st_uid=1000, st_gid=1000, st_size=4, st_atime=1407370037, 
st_mtime=1407370025, st_ctime=1407370025)

>>> pathlib.Path('foo/').lstat()
os.stat_result(st_mode=41471, st_ino=1981954, st_dev=2050, st_nlink=1, 
st_uid=1000, st_gid=1000, st_size=4, st_atime=1407370037, 
st_mtime=1407370025, st_ctime=1407370025)


But you can also call resolve() explicitly if you want to act on the 
link target rather than the link itself:


>>> pathlib.Path('foo/').resolve().lstat()
os.stat_result(st_mode=17407, st_ino=917505, st_dev=2050, st_nlink=7, 
st_uid=0, st_gid=0, st_size=4096, st_atime=1407367916, 
st_mtime=1407369857, st_ctime=1407369857)


Am I overlooking other cases?

Regards

Antoine.

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib handling of trailing slash (Issue #21039)

2014-08-06 Thread Alexander Belopolsky
On Wed, Aug 6, 2014 at 8:11 PM, Antoine Pitrou  wrote:

> Am I overlooking other cases?


There are many interfaces where trailing slash is significant.  For
example, rsync uses trailing slash on the target directory to avoid
creating an additional directory level at the destination.  Loosing it when
passing path strings through pathlib.Path() may be a source of bugs.
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib handling of trailing slash (Issue #21039)

2014-08-06 Thread Antoine Pitrou

Le 06/08/2014 20:50, Alexander Belopolsky a écrit :

On Wed, Aug 6, 2014 at 8:11 PM, Antoine Pitrou mailto:[email protected]>> wrote:

Am I overlooking other cases?

There are many interfaces where trailing slash is significant.  For
example, rsync uses trailing slash on the target directory to avoid
creating an additional directory level at the destination.  Loosing it
when passing path strings through pathlib.Path() may be a source of bugs.


pathlib is generally concerned with filesystem operations written in 
Python, not arbitrary third-party tools. Also it is probably easy to 
append the trailing slash in your command-line invocation, if so desired.


Regards

Antoine.



___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib handling of trailing slash (Issue #21039)

2014-08-06 Thread Ben Finney
Antoine Pitrou  writes:

> Le 06/08/2014 20:50, Alexander Belopolsky a écrit :
> > There are many interfaces where trailing slash is significant. […]
> > Loosing it when passing path strings through pathlib.Path() may be a
> > source of bugs.
>
> pathlib is generally concerned with filesystem operations written in
> Python, not arbitrary third-party tools.

The operating system shell is more than an “arbitrary third-party tool”,
though; it preserves paths, and handles invoking commands.

You seem to be saying that ‘pathlib’ is not intended to be helpful for
constructing a shell command. Will its documentation warn that is so?

> Also it is probably easy to append the trailing slash in your
> command-line invocation, if so desired.

The trouble is that one can desire it, and construct a path knowing that
the presence or absence of a trailing slash has semantic significance;
and then have it unaccountably altered by the pathlib.Path code. This is
worse than preserving the semantic value.

-- 
 \   “But Marge, what if we chose the wrong religion? Each week we |
  `\  just make God madder and madder.” —Homer, _The Simpsons_ |
_o__)  |
Ben Finney

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib handling of trailing slash (Issue #21039)

2014-08-06 Thread Antoine Pitrou


Le 06/08/2014 22:12, Ben Finney a écrit :

You seem to be saying that ‘pathlib’ is not intended to be helpful for
constructing a shell command.


pathlib lets you do operations on paths. It also gives you a string 
representation of the path that's expected to designate that path when 
talking to operating system APIs. It doesn't give you the possibility to 
store other semantic variations ("whether a new directory level must be 
created"); that's up to you to add those.


(similarly, it doesn't have separate classes to represent "a file", "a 
directory", "a non-existing file", etc.)


Regards

Antoine.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com