On Thu, 07 Jun 2018 10:04:53 +0200, Antoon Pardon wrote:

> On 07-06-18 05:55, Steven D'Aprano wrote:
>> Python strings are rich objects which support the Unicode code point
>> \0 in them. The limitation of the Linux kernel that it relies on
>> NULL-terminated byte strings is irrelevant to the question of what
>> os.path.exists ought to do when given a path containing NUL. Other
>> invalid path names return False.
>
> It is not irrelevant. It makes the distinction clear between possible
> values and impossible values.

That is simply wrong. It is wrong in principle, and it is wrong in
practice, for reasons already covered to death in this thread.

It is *wrong in practice* because other impossible values don't raise
ValueError, they simply return False:

- illegal pathnames under Windows, those containing special characters
  like ? > < * etc, simply return False;

- even on Linux, illegal pathnames like "" (the empty string) return
  False;

- invalid pathnames with too many path components, or too many
  characters in a single component, simply return False;

- the os.path.exists() function is not documented as making a three-way
  split between "exists, doesn't exist and invalid";

- and it isn't even true to say that NUL is illegal in pathnames: there
  are at least five file systems that allow either NUL bytes (FAT-8,
  MFS, HFS) or Unicode \0 code points (HFS Plus and Apple File System).

And it is *wrong in principle* because in the most general case, there
is no way to tell which pathnames are valid or invalid without querying
an actual file system. In the case of Linux, any directory could be used
as a mount point. Is "/mnt/some?file" valid or invalid? If an NTFS file
system is mounted on /mnt, it is invalid; if an ext4 file system is
mounted there, it is valid; if there's nothing mounted there, the
question is impossible to answer.

>> As a Python programmer, how does treating NUL specially make our life
>> better?
>
> By treating possible path values differently from impossible path
> values.

But it doesn't do that. "Pathnames cannot contain NUL" is a falsehood
that programmers wrongly believe about paths. HFS Plus and Apple File
System support NULs in paths.

So what it does is wrongly single out one *POSSIBLE* path value to raise
an exception, while other so-called "impossible" path values simply
return False.

But in the spirit of compromise, okay, let's ignore the existence of
file systems like HFS which allow NUL. Apart from Mac users, who uses
them anyway?
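
For anyone who wants to check the "wrong in practice" list above on
their own system, here is a minimal sketch. The helper name probe() and
the sample paths are made up for illustration. What the NUL case gives
depends on the Python version: the releases discussed in this thread
raise ValueError, while later ones (3.8 onward, as far as I know) return
False like every other case.

```python
import os.path


def probe(pathname):
    """Return os.path.exists(pathname), or the exception's class name
    if the call raises ValueError instead of returning a bool."""
    try:
        return os.path.exists(pathname)
    except ValueError as exc:
        return type(exc).__name__


# Hypothetical sample paths: empty string, an overlong single
# component, and a path containing the \0 code point.
samples = ["", "x" * 100000, "spam\0eggs"]

for path in samples:
    # Truncate the repr so the overlong path stays readable.
    print(repr(path)[:30], "->", probe(path))
```

The first two samples come back as plain False; only the NUL case has
ever been treated differently.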

Let's pretend that every file system in existence, now and into the
future, will prohibit NULs in paths.

Have you ever actually used this feature? When was the last time you
wrote code like this?

    try:
        flag = os.path.exists(pathname)
    except ValueError:
        handle_null_in_path()
    else:
        if flag:
            handle_file()
        else:
            handle_invalid_path_or_no_such_file()

I want to see actual, real code used in production, not made-up code
snippets, that demonstrates that this is a useful distinction to make.

Until such time that somebody shows me an actual real-world use-case for
wanting to make this distinction for NULs and NULs alone, I call
bullshit.


--
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson

--
https://mail.python.org/mailman/listinfo/python-list