Paths as sequences of path components

Mark H Weaver Mon, 23 May 2011 13:14:37 -0700

Hello all,

I really like the basic gist behind Noah's proposal, to allow programs
to optionally represent paths (roughly) as sequences of path components.
I haven't worked out all the details, and I'm glad to leave that job to
someone else, but I do have a few comments to add:


First of all, I think that the paths-as-components layer should be
_above_ the POSIX-bytestrings-as-SCM-strings layer.  In other words, the
pathnames-as-components code should represent both complete pathnames
and path components as SCM strings.

In addition, I hope that the paths-as-components layer will allow code
to conveniently manipulate paths while avoiding some of the common
security problems that can arise.  For example, a web application should
be able to easily and safely use a user-supplied string to construct a
pathname, without having to search the user-supplied string for things
like "../../../../etc/passwd".

When constructing paths from components, I think we should prevent a
single component from being interpreted by the OS as multiple
components.  In other words, we should make sure that components do not
contain path separators or other characters which are illegal in
filenames (e.g. NUL).  Either an exception should be thrown or they
should be escaped somehow.  If escaped, I think the transformation
should be bijective.
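
As a rough illustration of the two options above (throw an exception, or escape reversibly), here is a minimal sketch in Python. All names in it (check_component, escape_component, unescape_component) are hypothetical and belong to no existing API, and it assumes a POSIX-style "/" separator. The escaping shown is percent-escaping of "%", "/" and NUL, which is just one possible invertible scheme.

```python
# Hypothetical helpers for illustration only; not an existing Guile or Python API.
SEPARATOR = "/"                # assumption: POSIX-style path separator
FORBIDDEN = {SEPARATOR, "\0"}  # characters that may not appear in a component


def check_component(component):
    """Option 1: reject a component the OS could read as several components."""
    for ch in component:
        if ch in FORBIDDEN:
            raise ValueError(
                "illegal character %r in path component %r" % (ch, component))
    return component


def escape_component(component):
    """Option 2: percent-escape '%', '/' and NUL so the mapping is invertible."""
    out = []
    for ch in component:
        if ch == "%" or ch in FORBIDDEN:
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


def unescape_component(escaped):
    """Inverse of escape_component for any string that function produced."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] == "%":
            # The two characters after '%' are the hex code of the original.
            out.append(chr(int(escaped[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(escaped[i])
            i += 1
    return "".join(out)
```

Escaping "%" itself is what makes the round trip lossless: every string maps to a distinct escaped string, and unescape_component recovers the original from it.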

Also, I think there should be a very simple way to exclude "special"
path components such as "." and "..", in a platform-neutral way.

On the other hand, sometimes you really do need to include "." or ".."
in a path, and so it ought to be possible to include them if needed.
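
One way this could look, sketched in Python: a join operation that refuses "." and ".." by default but accepts them when the caller explicitly asks. The function name join_components and the allow_special flag are made up for this example, and the sketch assumes POSIX conventions ("/" as separator, "." and ".." as the special names).

```python
# Hypothetical sketch; join_components and allow_special are invented names.
SPECIAL = {".", ".."}  # assumption: the special components on the platform


def join_components(base, components, allow_special=False):
    """Append components to base, one path component per list element."""
    for c in components:
        # A single component must never expand into several.
        if "/" in c or "\0" in c:
            raise ValueError("illegal character in path component %r" % (c,))
        # "." and ".." are rejected unless the caller opts in.
        if c in SPECIAL and not allow_special:
            raise ValueError("special path component %r not allowed" % (c,))
    return "/".join([base.rstrip("/")] + list(components))
```

With this shape, a user-supplied string such as "../../../../etc/passwd" passed as one component is rejected because it contains separators, a bare ".." is rejected by default, and code that really needs ".." can pass allow_special=True.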

Apart from this, I wish to raise some questions for which I don't have
answers:

Should we provide a way to represent paths with multiple consecutive
path separators?

How should things like drive letters in DOS filenames be handled?

How should the distinction between absolute and relative paths be
handled?

Should our existing POSIX interfaces which accept pathnames be extended
to optionally accept these higher-level path objects?
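
For comparison only, and not as a proposal for how Guile should answer these questions, here is how one existing library, Python's pathlib, happens to handle them. The snippet uses only the pure-path classes, so it touches no files and runs on any platform.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

# Consecutive separators are collapsed when the path is parsed.
collapsed = PurePosixPath("usr//lib").parts        # ('usr', 'lib')

# "." is dropped during parsing, but ".." is kept as a component.
dotted = PurePosixPath("a/./../b").parts           # ('a', '..', 'b')

# Absolute paths carry the root as their first component.
absolute = PurePosixPath("/etc/passwd")
absolute_parts = absolute.parts                    # ('/', 'etc', 'passwd')

# Drive letters are held separately from the root and the components.
win = PureWindowsPath("C:\\Users\\mark")
win_drive, win_root = win.drive, win.root          # 'C:' and '\\'

# A drive without a root is still a relative path.
drive_relative = PureWindowsPath("C:notes.txt")
is_abs = drive_relative.is_absolute()              # False

# Path objects convert back to plain strings for string-based interfaces.
as_string = os.fspath(PurePosixPath("usr/lib"))    # 'usr/lib'
```

Whether collapsing repeated separators or dropping "." silently is desirable is exactly the kind of design choice the questions above leave open.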

     Best,
      Mark
