On 27/05/20 5:23 AM, BlindAnagram wrote:
On 26/05/2020 16:59, Mats Wichmann wrote:
On 5/26/20 8:56 AM, BlindAnagram wrote:
I came across an issue that I am wondering whether I should report as an
issue.  If I have a directory, say:

   base='C:\\Documents'

and I use os.path.join() as follows:

   join(base, '..\\..\\', 'build', '')

I obtain as expected from the documentation:

'C:\\Documents\\..\\..\\build\\'

But if I try to make the directory myself (as I tried first):

   join(base, '..\\..\\', 'build', '\\')

I obtain:

'C:\\'

The documentation says that an absolute path in the parameter list for
join will discard all previous parameters but '\\' is not an absolute path!

But it is - an absolute path is one that starts with the pathname separator.

In a string giving a file path on Windows '\\' is recognised as a
separator between directories and not as an indicator that what follows
is an absolute path based on the drive letter (although it might, as you
say, imply a drive context).

[Some of this answer may appear 'obvious' to you. If so, please understand that this conversation has the side-benefit of assisting other readers to understand Python, and that I would not presume to 'talk down' to you personally.]


Using the docs:

<<<
os.path.join(path, *paths)
Join one or more path components intelligently. The return value is the concatenation of path and any members of *paths with exactly one directory separator (os.sep) following each non-empty part except the last, meaning that the result will only end in a separator if the last part is empty. If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.

On Windows... [previously discussed]
>>>
https://docs.python.org/3/library/os.path.html
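
For readers without an MS-Windows machine to-hand, the two calls from the original post can be reproduced on any system by importing ntpath (the Windows flavour of os.path) directly. A minimal sketch; the variable names are mine:

```python
import ntpath  # the Windows implementation of os.path, importable on any OS

base = 'C:\\Documents'

# The last part is empty, so the result ends in exactly one separator.
with_empty = ntpath.join(base, '..\\..\\', 'build', '')
print(with_empty)   # C:\Documents\..\..\build\

# The last part starts with a separator, so join() treats it as rooted
# and discards everything before it except the drive.
with_sep = ntpath.join(base, '..\\..\\', 'build', '\\')
print(with_sep)     # C:\
```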

Let's start with the word "intelligently". Some might assume this to mean that it will distinguish between "separator between directories" and "absolute path". However, what it means is that it will select either the POSIX or the MS-Windows separator character(s), depending upon whether the final application is running on your machine or mine! It also means that it expects to handle the assembly of the parameters into a single path (utilising the appropriate separator).
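
To illustrate that 'selection': os.path is simply whichever of posixpath or ntpath suits the platform the code is running on. A small sketch, calling both flavours explicitly so the difference shows on a single machine (the path components are invented for illustration):

```python
import ntpath
import posixpath

parts = ('projects', 'build', 'output.log')  # hypothetical components

posix_result = posixpath.join(*parts)
windows_result = ntpath.join(*parts)

print(posix_result)     # projects/build/output.log
print(windows_result)   # projects\build\output.log
```

In ordinary code one would call os.path.join() and let Python choose.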

Please be advised that the pathlib library (Python 3.4) and the path-like interface (os.PathLike, Python 3.6) were added comparatively recently, and largely because the os library is considered dated. Accordingly, please don't attempt to draw parallels or 'rules' by comparing the under-pinning philosophies of 'past' with 'future'.

Remember that Python does not define files, paths, directories (folders), and backing-store structures; and as observed, they differ between OpSys. The os and os.path libraries exist to help us (poor, long-suffering coders) to cope with the differences. Accordingly, in Python, we do not deal with the file system itself, but we code to an abstraction of a file system! Python's interpreter handles 'the real situation' at run-time. (thank you Python!)

Please review the os library (https://docs.python.org/3/library/os.html). There (amongst other very useful facilities) you will find the likes of os.sep (and its relatives os.altsep, os.extsep, os.pathsep, and os.linesep, which illustrate how difficult it is to harmonise the abstraction to cope with the various realities). Note also the warning (which applies both to 'construction' and 'separation' of paths from path-components).
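
As a quick look at how those separators differ, here is a sketch that collects them from both flavours of os.path (the dictionary name is mine; in everyday code one would read os.sep, os.altsep, and os.pathsep for the current platform):

```python
import ntpath
import posixpath

# (sep, altsep, pathsep) for each flavour of os.path
seps = {
    flavour.__name__: (flavour.sep, flavour.altsep, flavour.pathsep)
    for flavour in (posixpath, ntpath)
}

for name, values in seps.items():
    print(name, values)
```

Note that the Windows flavour has an alternative separator ('/') whereas the POSIX flavour has none.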

Further reading? Because Python doesn't really define "path", let's turn to https://en.wikipedia.org/wiki/Path_%28computing%29 - but keep a headache remedy to-hand! This article explains such terms as "path", "root", and "device" (the last not existing in POSIX systems), across a range of operating systems.


OK, after all that, back to the question:-

Please examine the 'signature' of -join():

        os.path.join(path, *paths)

notice that the arguments are path[s] - NOT file-names, NOT directories (folders), and NOT path-components. Remember also the word "intelligent".

The objective of the function is to create a legal AND OpSys-appropriate path, by joining other *path(s)* together. Accordingly, the function considers each parameter to be a path. A path commencing with the symbol indicating the "root" is considered an "absolute path". A path commencing with any other character is considered a "relative path". [Apologies, in that experienced Pythonistas will find this 'stating the obvious', but learners often do not find such differences immediately apparent]
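
A short sketch of that distinction, using the POSIX flavour (the directory names are invented for illustration):

```python
import posixpath

# Every later parameter is a relative path: each is appended in turn.
appended = posixpath.join('/home/dn', 'projects', 'build')
print(appended)     # /home/dn/projects/build

# The second parameter starts at the root, so it is an absolute path
# and joining starts afresh from there.
restarted = posixpath.join('/home/dn', '/projects', 'build')
print(restarted)    # /projects/build
```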

This may explain why the OP's use of, or interpretation of, arguments to the function, differs from that of the library.


Why a subsequent parameter, interpreted as an absolute-path, should cause all previous parameters to be 'thrown away' is a design decision (and, per the docs quoted above, documented behaviour) - and I can't explain that choice. I can only say that because some systems use the same character to represent the "root" directory as they do for the path-component separator, there are situations where the two could be confused. Whether this happens on MS-Windows (or not) is beside the point when dealing with the Python file-system 'abstraction' functions!

IMHO: the best way to use -join() is not to mix its 'intelligence' with (OpSys-specific) string-literal separator characters of my own.
(even though I am (so much) smarter than it. Hah!)
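
By way of illustration, a sketch of the original request written without any literal separators. The helper name build_dir and the example base directories are my own inventions; real code would simply use os.path and os.pardir:

```python
import ntpath
import posixpath

def build_dir(flavour, base):
    """Two levels up from base, then 'build', with a trailing separator."""
    return flavour.join(base, flavour.pardir, flavour.pardir, 'build', '')

windows_path = build_dir(ntpath, 'C:\\Documents')
posix_path = build_dir(posixpath, '/home/dn/Documents')

print(windows_path)   # C:\Documents\..\..\build\
print(posix_path)     # /home/dn/Documents/../../build/
```

The same call produces an appropriate result for either flavour, because join() supplies every separator itself.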


The concept of paths is ugly in Windows because of the drive letter - a
drive letter is not actually part of a path, it's an additional piece of
context.  If you leave out the drive letter, your path is relative or
absolute within the current drive letter; if you include it your path is
relative or absolute within the specified drive letter.  So Python has
behaved as documented here: the indicator for an absolute path has
discarded everything (except the drive letter, which is necessary to
maintain the context you provided) which came before it in the join.
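
The drive-letter behaviour described above can be sketched as follows (ntpath is imported directly so that it runs on any OS; the paths are illustrative only):

```python
import ntpath

# The drive is split off as separate context from the rest of the path.
drive, rest = ntpath.splitdrive('C:\\Documents')
print(drive, rest)    # C: \Documents

# A rooted component without a drive keeps the drive given earlier...
same_drive = ntpath.join('C:\\Documents', '\\build')
print(same_drive)     # C:\build

# ...whereas a rooted component with its own drive replaces everything.
other_drive = ntpath.join('C:\\Documents', 'D:\\build')
print(other_drive)    # D:\build
```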

This is not consistent with how other file management functions in
os.path operate since they willingly accept '\\' as a directory separator.

Back to the idea of an 'abstraction'. Please realise that sometimes libraries offer 'helper functions' or seek to be forgiving in the argument-data they accept. However, this does not (necessarily) imply a "rule". Another "implementation detail"?
(this time in your favor/to your liking)
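
A sketch of the two roles the back-slash plays, again using ntpath directly (the directory names are made up):

```python
import ntpath

# Inside a path string, '\\' is accepted as a separator between directories.
head, tail = ntpath.split('C:\\Documents\\sub\\build')
print(head, tail)     # C:\Documents\sub build

embedded = ntpath.join('C:\\Documents', 'sub\\build')
print(embedded)       # C:\Documents\sub\build

# At the start of a join() parameter, the same character marks the root.
leading = ntpath.join('C:\\Documents', '\\sub\\build')
print(leading)        # C:\sub\build
```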


If indeed you're seeking a path that is terminated by the separator
character, you need to do what you did in the first example - join an
empty string at the end (this is documented).  The terminating separator
_usually_ isn't needed.  Sadly, sometimes it appears to be...

Another 'implementation detail' which copes with 'edge cases'. This one has caught me too!
[so, not that 'intelligent' after all? (joke)]
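
For completeness, a sketch of the empty-string idiom for a trailing separator (the paths are illustrative):

```python
import ntpath

no_trailing = ntpath.join('C:\\Documents', 'build')
print(no_trailing)    # C:\Documents\build

# A final empty part requests exactly one trailing separator.
trailing = ntpath.join('C:\\Documents', 'build', '')
print(trailing)       # C:\Documents\build\

# If the path already ends with a separator, it is not doubled.
already = ntpath.join('C:\\Documents\\build\\', '')
print(already)        # C:\Documents\build\
```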
--
Regards =dn
--
https://mail.python.org/mailman/listinfo/python-list