Eryk Sun <eryk...@gmail.com> added the comment:

>> Unix Python resolves the executable path with repeated _Py_wreadlink 
>> calls. Windows Python should do something similar to ensure the 
>> consistency of sys.executable with realpath(sys.executable).
>
> I don't think this necessarily follows. There's nowhere in the 
> documentation that says that sys.executable is even a valid path, 
> let alone the final path.

The reason is cross-platform parity for users who aren't language lawyers -- as 
long as it's not expensive and doesn't compromise reliability or safety. 

That said, resolving the real executable path is more of a practical concern in 
Unix. In Windows it's not generally useful since the loader does not resolve 
the real path of an executable. 

Unix Python also calls _Py_wrealpath on the script path, which I think is more 
relevant in Windows than the sys.executable case because it's at a higher level 
that we control. This allows running a script symlink at the command line (e.g. 
linked in a common bin directory in PATH) even if the script depends on modules 
in the real directory.

>> I think we want relpath(realpath('C:/Temp/foo'), realpath('S:/')) to 
>> succeed as r"..\foo". I don't think we want it to fail as a cross-
>> drive relative path.
>
> Cross-drive relative paths are fine though - they are just absolute 
> paths :)

relpath() fails if the target and start directories aren't on the same drive. 
Code that's creating a symlink in Windows has to handle this case by using an 
absolute symlink instead of a relative symlink, if that's what you mean. That's 
probably for the better, so I've changed my mind. Forcing scripts to create 
absolute symlinks is not an issue, even if it's unnecessary because the target 
and start directories can be resolved to the same drive. The mount point should 
take precedence. But that's an argument against using the final path. Mapped 
drives and subst drives will be resolved in the final path. Reverse mapping to 
the original drive, if possible, would be extra work.
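
As a rough illustration of the fallback described above, here's a minimal 
sketch of how link-creating code might choose between a relative and an 
absolute target. The helper name link_target is hypothetical, the drive 
letters are just the ones from this discussion, and both arguments are assumed 
to be absolute paths. It uses ntpath directly so that it behaves the same on 
any platform.

```python
import ntpath


def link_target(target, start):
    """Return a relative symlink target if possible, else an absolute one.

    Hypothetical helper; both arguments are assumed to be absolute paths.
    """
    try:
        # Raises ValueError when target and start are on different drives.
        return ntpath.relpath(target, start)
    except ValueError:
        # Cross-drive case: fall back to the normalized absolute target.
        return ntpath.normpath(target)


print(link_target("C:/Temp/foo", "C:/Temp/Subst"))  # same drive: relative
print(link_target("C:/Temp/foo", "S:/"))            # cross drive: absolute
```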

For example, say we start with "\\??\\S:\\". The object manager reparses the 
r"\??\S:" SymbolicLink as r"\??\C:\Temp\Subst". Next it reparses r"\??\C:" to a 
device object, with a resolved path such as 
r"\Device\HarddiskVolume2\Temp\Subst". The Device object type has a parse 
routine that's implemented by the I/O manager. This sends an IRP_MJ_CREATE 
request to the mounted file-system device (NTFS in this case) with the 
remaining path to be parsed, e.g. r"\Temp\Subst". Note that at this stage, 
information about the original drive "S:" is long gone.

If the file system in turn finds a reparse point, such as a file-system symlink 
or mount point, then it stops there and returns STATUS_REPARSE with the 
contents of the reparse buffer. The I/O manager itself handles symlink and 
mount-point reparsing, for which it implements behavior that's as close as 
possible to Unix symlinks and mount points. After setting up the new path to 
open, the I/O manager's parse routine returns STATUS_REPARSE to the object 
manager. Up to 63 reparse attempts are allowed, including within the object 
namespace itself. The limit of 63 reparse attempts is a simple way to handle 
reparse loops.
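
The reparse walk above can be modeled with a toy resolver. This is only a 
sketch of the idea, not how the object manager is implemented: the SYMLINKS 
table, the resolve function and the file name are made up for illustration, 
name comparison is case-sensitive here (unlike the real namespace), and 
file-system reparse points are left out entirely.

```python
MAX_REPARSE = 63  # the limit mentioned in the text above

# Hypothetical object-namespace links matching the "S:" example.
SYMLINKS = {
    r"\??\S:": r"\??\C:\Temp\Subst",
    r"\??\C:": r"\Device\HarddiskVolume2",
}


def resolve(path, links=SYMLINKS, limit=MAX_REPARSE):
    """Toy model: substitute link targets until no link prefix matches."""
    reparses = 0
    while True:
        for link, target in links.items():
            if path == link or path.startswith(link + "\\"):
                # "Reparse": restart parsing with the substituted path.
                path = target + path[len(link):]
                reparses += 1
                break
        else:
            return path  # no link matched, so parsing is finished
        if reparses > limit:
            raise OSError("too many reparse attempts")


print(resolve(r"\??\S:\file.txt"))
```

Note that the result carries no trace of "S:", which is the point made above 
about the original drive being lost early in the walk. A pair of links that 
point at each other trips the limit instead of looping forever.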

Assuming no file-system reparse points, we have as the final path 
r"\Device\HarddiskVolume2\Temp\Subst". To map this back to a DOS path, 
GetFinalPathNameByHandleW queries the mount-point manager for the canonical DOS 
device name for r"\Device\HarddiskVolume2". The mount-point manager knows about 
"C:" in this case, but it doesn't have a registry of subst drives. 
GetFinalPathNameByHandleW also doesn't enumerate DOS devices and map a path 
back to a matching subst drive. It supports only the canonical path. 
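
For comparison, here's a sketch of the enumeration that 
GetFinalPathNameByHandleW doesn't do. It assumes we already have a table that 
maps each DOS drive to its fully resolved NT target (for example, built by 
reading the SymbolicLink objects). The function name reverse_map and the 
dos_devices table are hypothetical, and the prefix match is case-sensitive for 
simplicity.

```python
def reverse_map(final_path, dos_devices):
    """Return the shortest DOS path for an NT final path, or None.

    dos_devices maps a drive name such as "S:" to its fully resolved
    NT target path (a hypothetical, precomputed table).
    """
    best = None
    for drive, target in dos_devices.items():
        if final_path == target or final_path.startswith(target + "\\"):
            # Keep the remainder of the path, or the drive root if empty.
            candidate = drive + (final_path[len(target):] or "\\")
            if best is None or len(candidate) < len(best):
                best = candidate
    return best


devices = {
    "C:": r"\Device\HarddiskVolume2",
    "S:": r"\Device\HarddiskVolume2\Temp\Subst",
}
print(reverse_map(r"\Device\HarddiskVolume2\Temp\Subst\foo", devices))
```

Picking the shortest candidate is one possible policy. It prefers the subst 
drive whenever the final path lies under the subst target.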

Reverse mapping a UNC path to a mapped drive would be even more difficult. We 
would have to doubly resolve, for example, from "M:" -> r"\Device\<redirector 
name>\;M:<logon session>\server\share\path\to\directory" -> 
r"\Device\Mup\;<redirector name>\;M:<logon 
session>\server\share\path\to\directory". Once we confirm that "M:" targets the 
MUP device, we can compare r"\server\share\path\to\directory" to check whether 
the final path contains this path. If so, we can replace it with "M:". That's a 
lot of work to get a non-canonical path, and that's the simplest case. For 
example, we could have a subst drive for a mapped drive, and the shortest path 
would be the subst drive.

To avoid resolving drives altogether, realpath() would have to manually walk 
the path instead of relying on GetFinalPathNameByHandleW.

> If we can easily tell the difference between directory junctions and 
> mapped drives, given that they are both identical types of reparse 
> points

Mapped drives and subst drives are not file-system reparse points. They're 
"DOS" devices, which are implemented as SymbolicLink objects in the "\\??\\" 
device-alias directory. A SymbolicLink object can be read via 
NtQuerySymbolicLinkObject, or via WINAPI QueryDosDeviceW if the SymbolicLink is 
in "\\??\\".
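
Here's a minimal ctypes sketch of reading such a SymbolicLink via 
QueryDosDeviceW. The wrapper name query_dos_device and the buffer size are my 
own choices, it only works on Windows, and it doesn't retry with a larger 
buffer if the target doesn't fit.

```python
import ctypes
import sys


def query_dos_device(name, size=4096):
    """Return the target path(s) of a DOS device such as "S:".

    Hypothetical wrapper around kernel32 QueryDosDeviceW (Windows only).
    """
    if sys.platform != "win32":
        raise OSError("QueryDosDeviceW is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    buf = ctypes.create_unicode_buffer(size)
    # Returns the number of characters stored, or 0 on failure.
    n = kernel32.QueryDosDeviceW(
        ctypes.c_wchar_p(name), buf, ctypes.c_ulong(size))
    if n == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    # The buffer holds a list of null-terminated strings.
    return [target for target in buf[:n].split("\0") if target]


if sys.platform == "win32":
    print(query_dos_device("C:"))
```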

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37993>
_______________________________________