On Sun, Feb 12, 2017 at 4:29 AM, Chris Angelico <ros...@gmail.com> wrote:
> Registry subkeys aren't paths, and the other two cases are extremely
> narrow. Convert slashes to backslashes ONLY in the cases where you
> actually need to.

\\?\ paths are required to exceed MAX_PATH (a paltry 260 characters) or
to avoid quirks of DOS paths (e.g. DOS device names, stripping trailing
dots or spaces). Some programs require backslashes in paths passed on
the command line -- e.g. more.com, but it works if the path is quoted;
obviously reg.exe for registry paths (they are paths); findstr.exe (a
grep-like program); mountvol.exe needs a \\?\ Volume GUID path; and
running "C:/Windows/system32/cmd.exe" parses "/cmd.exe" in its own name
as "/c md.exe", which is more fun if you have an "md.exe" in PATH.

In cases where you can use slash in the filesystem API, Windows is
doing the conversion for you by rewriting a copy of the path with slash
replaced by backslash -- among other normalizations. I'm not saying
that relying on the Windows base API to do this work for you is bad,
but as a rule it's simpler to call normpath on path literals because
you don't have to worry about edge cases like remembering to normalize
the path before passing it as a command-line argument.

If you're using pathlib, it already does this for you automatically:

    >>> p = pathlib.Path('spam/eggs')
    >>> os.fspath(p)
    'spam\\eggs'

There's been a significant effort to make pathlib interoperate with the
rest of the standard library in 3.6.

The point about registry subkeys inspires me to stray into Windows
internals stuff, so everyone can stop reading at this point...

Of course subkeys are relative paths. They're just not file-system
paths. Paths of named object types (e.g. Device, File, Key, Section,
Event, WindowStation, etc) are rooted in a single object namespace. The
only path separator in this namespace is backslash. Forward slash is
handled as a regular name character, except file systems in particular
reserve it for the sake of POSIX and DOS compatibility.

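To make that contrast concrete, here's a minimal sketch that runs on any
platform, since ntpath and PureWindowsPath only manipulate strings.
Note that split_object_path is a hypothetical toy helper, not a Windows
or Python API; it just splits on backslash the way the object namespace
does, so a forward slash stays inside a name.

```python
import ntpath
from pathlib import PureWindowsPath

# File-system path literals: normalizing replaces slash with backslash.
fs_norm = ntpath.normpath("spam/eggs")
fs_pure = str(PureWindowsPath("spam/eggs"))
print(fs_norm)  # spam\eggs
print(fs_pure)  # spam\eggs


def split_object_path(path):
    """Hypothetical toy: split an object-namespace path on backslash only."""
    return [name for name in path.split("\\") if name]


# In the object namespace a forward slash is just another name character.
print(split_object_path(r"\Registry\Machine\Software/Python"))
# ['Registry', 'Machine', 'Software/Python']
```

The last call yields three names rather than four, which is the reason
a subkey written with a forward slash doesn't resolve the way a file
path would.
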
Here's a broad overview of what the object manager's ObOpenObjectByName
function does (the "Ob" prefix is for the object manager), which gets
called by system services such as NtOpenFile and NtOpenKeyEx to open a
named object. The object manager implements Directory and SymbolicLink
objects, so it's the first system to parse a path, starting with the
root directory. It continues parsing path elements until it reaches an
object type that's managed by another system. Then it passes control to
that object's ParseProcedure. For a Key this is CmpParseKey (the "Cm"
prefix is for the configuration manager). For a Device such as a disk
volume, it's IopParseDevice (the "Io" prefix is for the I/O manager).
Assuming there isn't an object-type mismatch (e.g. calling NtOpenFile
on a registry key) and the object is successfully created or
referenced, then a handle created in the calling process handle table
is returned to the system service, which returns it to the user-mode
caller.

For example, "C:/Program Files/Python36" first gets rewritten by the
runtime library as "\??\C:\Program Files\Python36" (consider this a raw
string, please). The object manager first parses "\??\C:" by looking
for it as "\Sessions\0\DosDevices\[LogonSessionId]\C:". That's unlikely
for the C: drive (though possible). Next it checks for "\Global??\C:".
That should be a symbolic link to something like
"\Device\HarddiskVolume2". Next it calls the parse procedure for this
device object, IopParseDevice, which sees that this is a volume device
that's handled by a file-system driver. Say the context of this parse
is in the middle of an NtOpenFile call. In this case, the I/O manager
creates a File object (which references the remaining path
"\Program Files\Python36") and an I/O request packet (IRP) to be
serviced by the file-system device stack.
If the file system supports reparse points, such as NTFS junctions and
symbolic links, the IRP might be completed with a STATUS_REPARSE code
and a new path to parse. Finally, if the open succeeds, the object
manager creates a handle for the File object in the process handle
table, and the handle value is returned to the caller.

Now consider calling NtOpenKeyEx to open a registry key. The master
registry hive has a root key named "\Registry", and two commonly used
subkeys "\Registry\Machine" and "\Registry\Users". We typically
reference the latter two keys via the pseudo-handles HKEY_LOCAL_MACHINE
(HKLM) and HKEY_USERS (HKU) -- because these also work when accessing a
remote registry over RPC.

Say we're trying to open "HKLM\Software\Python\PythonCore". The real
local path is "\Registry\Machine\Software\Python\PythonCore". The first
thing to do is open and cache the real handle for the HKLM
pseudo-handle, by opening "\Registry\Machine". NtOpenKeyEx calls
ObOpenObjectByName, and the object manager begins parsing the path. It
hands off parsing to the ParseProcedure of the "\Registry" object,
CmpParseKey, which returns a pointer reference to the
"\Registry\Machine" key object. The object manager creates a handle for
the object in the process handle table, and NtOpenKeyEx returns this
handle to the caller. It's little known, but the registry also supports
symbolic links, so CmpParseKey may return STATUS_REPARSE with a new
path for the object manager to parse.

Next Windows does a relative open on the path
"Software\Python\PythonCore" using the "\Registry\Machine" handle as
the RootDirectory for the ObjectAttributes of the open. A relative open
for a disk volume works the same way (e.g. opening a file relative to a
handle for the working directory). The interesting thing about the
documented registry API is that it exposes this native ability to open
relative to a handle (like Unix *at system calls).

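From Python, the relative open described above is what the winreg
module exposes: every OpenKey call takes a key handle plus a subkey
path relative to it. Here's a minimal sketch. The function name
list_pythoncore_tags is made up for illustration, and it assumes an
all-users install that registered itself under HKLM (a per-user install
would be under HKEY_CURRENT_USER instead); it returns None off Windows
or when the key can't be opened.

```python
import sys


def list_pythoncore_tags():
    """Return subkey names of HKLM\\Software\\Python\\PythonCore, or None."""
    if sys.platform != "win32":
        return None  # winreg is only available on Windows
    import winreg

    try:
        # First open: "Software\Python" relative to the HKLM pseudo-handle.
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, r"Software\Python") as parent:
            # Second open: "PythonCore" relative to the real handle from above.
            with winreg.OpenKey(parent, "PythonCore") as key:
                count = winreg.QueryInfoKey(key)[0]  # number of subkeys
                return [winreg.EnumKey(key, i) for i in range(count)]
    except OSError:
        return None  # key missing or access denied


print(list_pythoncore_tags())
```

Note that the subkey strings use backslash; writing "Software/Python"
here would look for a single key whose name contains a slash.
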
Similar functionality could be supported in CreateFile by extending the
sized SECURITY_ATTRIBUTES structure to add a RootDirectory field. As
is, you have to call NtCreateFile or NtOpenFile to get this
functionality, which isn't supported.

Let's check this out in the debugger. First Windows opens
"\Registry\Machine" to cache the real handle for its HKLM
pseudo-handle.

    Breakpoint 0 hit
    ntdll!NtOpenKeyEx:
    00007ffd`b29b82d0 4c8bd1          mov     r10,rcx
    0:000> !obja @r8
    Obja +000000913bfef878 at 000000913bfef878:
            Name is \REGISTRY\MACHINE
            OBJ_CASE_INSENSITIVE
    0:000> r rcx
    rcx=000000913bfef838
    0:000> pt
    ntdll!NtOpenKeyEx+0x14:
    00007ffd`b29b82e4 c3              ret

The handle returned is 0x70:

    0:000> dq 913bfef838 l1
    00000091`3bfef838  00000000`00000070
    0:000> g

Next it opens the relative path "Software\Python\PythonCore".

    Breakpoint 0 hit
    ntdll!NtOpenKeyEx:
    00007ffd`b29b82d0 4c8bd1          mov     r10,rcx
    0:000> !obja @r8
    Obja +000000913bfef6d0 at 000000913bfef6d0:
            Name is Software\Python\PythonCore
            OBJ_CASE_INSENSITIVE

The RootDirectory field is 0x70, the handle for "\Registry\Machine", as
we can easily see in the kernel debugger when looking at the above
address (0x913bfef6d0):

    lkd> ?? (nt!_OBJECT_ATTRIBUTES *)0x913bfef6d0
    struct _OBJECT_ATTRIBUTES * 0x00000091`3bfef6d0
       +0x000 Length           : 0x30
       +0x008 RootDirectory    : 0x00000000`00000070 Void
       +0x010 ObjectName       : 0x00000091`3bfef978 _UNICODE_STRING "Software\Python\PythonCore"
       +0x018 Attributes       : 0x40
       +0x020 SecurityDescriptor : (null)
       +0x028 SecurityQualityOfService : (null)

    lkd> !handle 0x70 3
    0070: Object: ffffe00a8ddfbf70  GrantedAccess: 000f003f (Audit)
    ...
    Name: \REGISTRY\MACHINE

To close this discussion out, here's another problem involving slash in
named objects. The Windows base API creates many per-session objects in
a "BaseNamedObjects" directory located at
"\Sessions\[SessionId]\BaseNamedObjects" or globally in
"\BaseNamedObjects". It's a dumping ground for objects that don't have
a better place to call home.
Within the session directory there's a "Global" symbolic link to the
system-wide "\BaseNamedObjects" directory. When creating a named
object, you can use that link to name the object globally for all
sessions. For example, creating a shared-memory section named
"Global\MySharedMemory" actually creates
"\BaseNamedObjects\MySharedMemory". But if you accidentally write
"Global/MySharedMemory", you'll instead create an object named with a
literal slash in the local session's BaseNamedObjects.

I've seen this problem before in a Stack Overflow question. People get
lulled into a false belief that the Windows API will handle forward
slashes as path separators in anything that's pathlike (and indeed is
actually implemented as a relative path under the hood), but that's
only the file-system API.

--
https://mail.python.org/mailman/listinfo/python-list