This and other RFCs are available on the web at
http://dev.perl.org/rfc/
=head1 TITLE
A Sandboxing mechanism for Perl 6
=head1 VERSION
Maintainer: Matthew Byng-Maddick <[EMAIL PROTECTED]>
Date: 30 Sep 2000
Mailing List: [EMAIL PROTECTED]
Number: 353
Version: 1
Status: Developing
Co-Maintainer: Michael Schwern <[EMAIL PROTECTED]>
=head1 ABSTRACT
The current C<Safe> mechanism in Perl 5 implements its restrictions at
the opcode level. This RFC proposes that the Perl 6 runtime environment
also needs sandboxing features that stop untrusted code from accessing
or modifying things it shouldn't. The motivation is that untrusted Perl
is often run as root for no good reason, and that root users need to
maintain differing levels of trust in different modules.
This is not a replacement for C<Safe>, merely a complement to it. There is
some overlap between the two, but for the most part C<Safe> doesn't
give the granularity that is often needed. Some of this can only happen
at the perl interpreter level.
=head1 DESCRIPTION
This RFC attempts to deal with the practical problems of running untrusted
code within perl as a user with a high level of privilege. It suggests a
new set of restrictions on certain interactions with the operating system,
in several domains.
Various levels of protection are proposed.
=head2 Filesystem Interaction
This would restrict any filesystem calls. The trivial options are
off, tainted and on, but any combination of subdirectories could also be
allowed. In that case C<lstat()> calls would be needed on each path
component to check that none is a symlink to something outside the tree
(C<stat()> follows symlinks and so cannot detect them). Any restriction
at all would taint the data read in.
In a subdirectory, one might want to allow read access but not write
access. One might also want to specify that any file opened for writing
is opened with C<O_EXCL>, so that it behaves like C<set -C> (noclobber)
in bash.
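As a rough illustration of the C<O_EXCL> behaviour in today's Perl 5, the following sketch refuses to overwrite an existing file. The helper name C<open_noclobber> and the 0600 mode are assumptions made for this example, not part of the proposal.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Hypothetical helper: open a file for writing only if it does not
# already exist. O_CREAT | O_EXCL makes the open fail (EEXIST) when
# the path is already present, so existing data is never clobbered.
sub open_noclobber {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0600)
        or die "sandbox: refusing to write $path: $!\n";
    return $fh;
}
```

A sandbox could route every write-mode open through a check of this kind, leaving read-mode opens untouched.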
=head2 INET Socket Interaction
This would be similar to the filesystem, in that any kind of restriction
would cause the data to be tainted. The restriction could also be on which
ports the program is allowed to bind to (and which networks it will accept
connections from), and which ports and networks it may open sockets to at
its remote end.
=head2 UNIX Domain Sockets
These would be basically subject to the Filesystem constraints outlined
above, but with their own possible restrictions over and above.
=head2 Memory and Resource Usage
Most of this can be done with something like C<ulimit()>, although it would
be good if Perl kept its own accounting of this. That would enable Perl
to raise the exception itself, as opposed to the system just killing the
process. Relying on the system alone could also be a problem on platforms
without this kind of resource management.
=head2 Allow exec() and eval() calls
Obviously these will be subject to the tainted data mechanism outlined
above, but you might want to restrict them altogether, in which case the
trapping at the language level is useful.
This RFC proposes that the standard tainting mechanism be used for the
handling of this data, but with similar exceptions being raised when, for
example, you try to read on a socket outside your allowed network range
(here read has been chosen because you may want to do your own processing
with the peer name).
In the case of an C<eval STATEMENT> call (I'm not sure whether block ones
should count), the same sandboxing restrictions would apply.
=head1 IMPLEMENTATION
A few issues are raised by this.
The security exceptions should be raised by C<die()> in the normal way.
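A minimal Perl 5 sketch of what raising and trapping such an exception might look like follows. The class name C<Sandbox::Violation> and the helpers C<deny> and C<violation_from> are hypothetical; the RFC does not define them.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical exception class; the RFC does not name one.
package Sandbox::Violation;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub domain  { return $_[0]{domain} }
sub message { return $_[0]{message} }

package main;

# Hypothetical helper that a sandbox check would call on a denied operation.
sub deny {
    my ($domain, $message) = @_;
    die Sandbox::Violation->new(domain => $domain, message => $message);
}

# Run a code reference; return the violation it raised, or nothing if
# it completed. Any other kind of error is re-raised unchanged.
sub violation_from {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return if $ok;
    my $err = $@;
    return $err if blessed($err) && $err->isa('Sandbox::Violation');
    die $err;
}
```

Because the exception is an ordinary C<die()>, the calling code can trap it with C<eval> and inspect which domain was violated.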
The simple off/on implementation can be made to be fast. A complex set of
rules will be slower, but this is considered an acceptable price to pay
for the security.
This B<will> allow you to shoot yourself in the foot, and having thought
about it, I'm not sure there is much that can be done about that.
The sandbox should be locally scoped to its enclosing block (like the
behaviour of the current C<local>), allowing it to become more restrictive
within the scope. A possibility is that within the same block (rather
than subblocks) the sandbox can become less restrictive, so you can close
down the box, C<require> your untrusted code, and then open it a bit.
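The scoping described here can be approximated in Perl 5 with C<local>, as the sketch below shows. The C<%POLICY> table, its 'allow'/'deny' values and the helper names are all assumptions for illustration. Note that C<local> gives dynamic scope, so the restriction also covers everything called from within the block.

```perl
use strict;
use warnings;

# Hypothetical policy table: one entry per domain, 'allow' or 'deny'.
our %POLICY = (fs => 'allow', exec => 'allow');

# Die if the named domain is currently closed.
sub check {
    my ($domain) = @_;
    die "sandbox: $domain access denied\n"
        if ($POLICY{$domain} // 'allow') eq 'deny';
    return 1;
}

# Stand-in for untrusted code that wants to touch the filesystem.
sub untrusted_fs_call {
    check('fs');
    return 'ok';
}

# Run a code reference with the filesystem closed for the duration of
# the call. local() restores the previous policy when this sub returns.
sub run_restricted {
    my ($code) = @_;
    local $POLICY{fs} = 'deny';
    return eval { $code->() };    # undef if the sandbox refused
}
```

Code reached through a string C<eval> inside the restricted call sees the same policy, since the lookup happens at run time against the localised table.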
The one exception to this is that modules in perl's own lib directory
(tainting the C<PERL6LIB> (?) environment variable) might be allowed to
run with full privilege, as these have already been checked and will
generally not be untrusted. This is a problem, as it is then possible to
implement a denial of resources attack using Perl 6's own modules. The
author will try to think of a way round this. :)
=head2 Filesystem
This is OK if it is just on/off, as all operations are then either
accepted or denied, and all filehandles could be tainted under the correct
circumstances. The allowed directories version of this will almost
certainly be more sophisticated, having to recurse through the path, and
either (i) following symlinks, (ii) ignoring them or (iii) throwing an
exception.
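A sketch of option (i), resolving symlinks and then checking containment, might look like this in Perl 5. The function name C<path_is_inside> is hypothetical, Unix-style path separators are assumed, and the check is open to a race if the tree changes between the check and the later open.

```perl
use strict;
use warnings;
use Cwd qw(realpath);

# Hypothetical check: true if $path, after symlinks are resolved,
# is $root itself or lies somewhere beneath it.
sub path_is_inside {
    my ($root, $path) = @_;
    my $real_root = realpath($root);
    my $real_path = realpath($path);
    return 0 unless defined $real_root && defined $real_path;
    return 1 if $real_path eq $real_root;

    # Compare against "root/" so that /tmp/box does not match /tmp/boxes.
    (my $prefix = $real_root) =~ s{/+\z}{};
    $prefix .= '/';
    return index($real_path, $prefix) == 0 ? 1 : 0;
}
```

Options (ii) and (iii) would instead examine each path component as a link and either skip it or raise an exception.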
=head2 Sockets
This is again similar. The on/off checks are far simpler than the
rule-based ones, although port range checks are perhaps easier than the
filesystem checks.
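To give a feel for how small the port and network checks could be, here is a Perl 5 sketch. It handles IPv4 only, and the function names and the [low, high] range representation are assumptions made for the example.

```perl
use strict;
use warnings;

# Convert a dotted-quad IPv4 address to an integer; empty return if invalid.
sub ipv4_to_int {
    my ($ip) = @_;
    my @octets = $ip =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
        or return;
    return if grep { $_ > 255 } @octets;
    my $n = ($octets[0] << 24) | ($octets[1] << 16)
          | ($octets[2] << 8)  |  $octets[3];
    return $n;
}

# True if $ip falls within the network given in CIDR form, e.g. 10.0.0.0/8.
sub ip_in_network {
    my ($ip, $cidr) = @_;
    my ($net, $bits) = split m{/}, $cidr, 2;
    return 0 unless defined $net && defined $bits;
    return 0 unless $bits =~ /\A\d+\z/ && $bits <= 32;
    my $ip_n  = ipv4_to_int($ip);
    my $net_n = ipv4_to_int($net);
    return 0 unless defined $ip_n && defined $net_n;
    my $mask = $bits == 0 ? 0 : (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
    my $same = ($ip_n & $mask) == ($net_n & $mask);
    return $same ? 1 : 0;
}

# True if $port lies within any of the allowed [low, high] ranges.
sub port_allowed {
    my ($port, @ranges) = @_;
    for my $range (@ranges) {
        return 1 if $port >= $range->[0] && $port <= $range->[1];
    }
    return 0;
}
```

A bind or connect check would call both functions and raise the security exception if either returns false.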
=head2 Resource usage
This could either be implemented entirely within Perl 6, which carries
all of the overheads of profiling code, or it could be implemented using
C<ulimit()> on systems which have it, catching the resulting signal (I
haven't been able to find any reference to which signal this is...)
=head2 exec() type calls
This is just a case of on-off, although one could allow certain programs
to be run with this (much like sudo or similar), in the spirit of perl
flexibility.
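A sudo-like allow-list could be sketched in Perl 5 as follows. The C<%ALLOWED_PROGRAM> table, its example entries and the C<guarded_system> wrapper are hypothetical names chosen for this illustration.

```perl
use strict;
use warnings;

# Hypothetical allow-list, keyed by the full path of each permitted program.
our %ALLOWED_PROGRAM = map { $_ => 1 } ('/bin/ls', '/usr/bin/uptime');

# Run a program only if it is on the allow-list. The list form of
# system() with an explicit program avoids passing anything to a shell.
sub guarded_system {
    my ($program, @args) = @_;
    die "sandbox: exec of $program denied\n"
        unless $ALLOWED_PROGRAM{$program};
    return system { $program } $program, @args;
}
```

Matching on the full path rather than the bare command name keeps a modified C<PATH> from substituting a different binary.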
=head2 eval STATEMENT calls
These could be restricted altogether, but should also be subject to the
same restrictions as their containing blocks. Thus a malicious program
shouldn't be able to do anything via an eval() that it couldn't already
do.
=head2 Interface to the Language
Interface to the language has been thought about, but the author does not
feel qualified to be authoritative on this point.
The syntax should be something like the current C<use pragma>
directives, possibly something like:
    use sandbox 'fs' (. => ALLOW_SUBDIRS | ALLOW_READ |
                      ALLOW_READ | ALLOW_CLOBBER);
How this is actually implemented internally could be something like the
$^H variable for C<use strict>, or it could be a perl builtin, although
the complex data structures don't bode well for this.
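For comparison, Perl 5 lets a user-defined pragma store lexically scoped settings in the C<%^H> hints hash. The sketch below uses that facility with a hypothetical C<sandbox> package defined inline; the key format and the policy strings are assumptions. C<%^H> can only hold simple scalar values, which is one concrete form of the data structure concern raised above.

```perl
use strict;
use warnings;

package sandbox;    # hypothetical pragma, defined inline for illustration

# Called at compile time; records the policy in the hints hash %^H,
# which is scoped to the block currently being compiled.
sub import {
    my ($class, $domain, $policy) = @_;
    $^H{"sandbox/$domain"} = $policy;
}

# Look up the policy in effect at the point this function is called from.
sub policy_for {
    my ($domain) = @_;
    my $hints = (caller(0))[10] || {};
    return $hints->{"sandbox/$domain"} // 'unrestricted';
}

package main;

# Compiled outside any sandbox scope.
sub outer_policy {
    return sandbox::policy_for('fs');
}

# The BEGIN block stands in for a "use sandbox fs => 'read-only'" line.
sub inner_policy {
    BEGIN { sandbox->import(fs => 'read-only') }
    return sandbox::policy_for('fs');
}
```

The setting applies only to code compiled inside the block that enabled it, so it does not follow calls into other subroutines the way a C<local> variable would.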
=head1 REFERENCES
perldoc Safe
man ulimit