[Python-Dev] Python jails

2011-06-10 Thread Sam Edwards
Hello! This is my first posting to the python-dev list, so please
forgive me if I violate any unspoken etiquette here. :)

I was looking at Python 2.x's f_restricted frame flag (or, rather, the
numerous ways around it) and noticed that most (all?)
of the attacks to escape restricted execution involved the attacker
grabbing something he wasn't supposed to have.
IMO, Python's extensive introspection features make that a losing
battle, since it's simply too easy to forget to blacklist something and
for an attacker to find it. Not only that: even with a perfect
vacuum-sealed jail, an attacker can still bring down the interpreter by
exhausting memory or consuming excess CPU.

I think I might have a way of securely sealing-in untrusted code. It's a
fairly nascent idea, though, and I haven't worked out
all of the details yet, so I'm posting what I have so far for feedback
and for others to try to poke holes in it.

Absolutely nothing here is final. I'm just framing out what I generally
had in mind. Obviously, it will need to be adjusted to
be consistent with "the Python way" - my hope is that this can become a
PEP. :)


>>> # It all starts with the introduction of a new type, called a jail.
... # (I haven't yet worked out whether it should be a builtin type, or a
... # module.) Unjailed code can create jails, which will run the
... # untrusted code and keep strict limits on it.
...
>>> j = jail()
>>> dir(j)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__',
 '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__',
 '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
 'acquire', 'getcpulimit', 'getcpuusage', 'getmemorylimit',
 'getmemoryusage', 'gettimelimit', 'gettimeusage', 'release',
 'setcpulimit', 'setmemorylimit', 'settimelimit']
>>> # The jail monitors three things: Memory (in bytes), real time (in
... # seconds), and CPU time (also in seconds), and it also allows you to
... # impose limits on them. If any limit is non-zero, code in that jail
... # may not exceed its limit. Exceeding a memory limit will result in a
... # MemoryError. I haven't decided what CPU/real time limits should raise.
... # The other two calls are "acquire" and "release," which allow you to
... # seal (any) objects inside the jail, or bust them out. Objects inside
... # the jail (i.e. created by code in that jail) contribute their
... # __sizeof__() to the j.getmemoryusage()
...
>>> def stealPasswd():
...     return open('/etc/passwd','r').read()
...
>>> j.acquire(stealPasswd)
>>> j.getmemoryusage()  # The stealPasswd function, its code, etc. are now locked away within the jail.
375
>>> stealPasswd()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
JailError: tried to access an object outside of the jail

The object in question is, of course, 'open'. Unlike the f_restricted
model, the jail was freely able to grab the open() function, but it is
absolutely unable to touch it: it can't call it, set/get/delete its
attributes or items, or pass it as an argument to any function. There are
three criteria that determine whether an object can be accessed:
a. The code accessing the object is not within a jail; or
b. The object belongs to the same jail as the code accessing the object; or
c. The object has an __access__ function, and
theObject.__access__(theJail) returns True.

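To make those criteria concrete, here's a minimal pure-Python sketch of
the check the interpreter would have to perform on every object access.
Everything in it (can_access, jail_of, JailError) is hypothetical; it
just restates criteria a-c in code:

class JailError(Exception):
    pass

def can_access(obj, accessing_jail, jail_of):
    """Hypothetical mirror of criteria a-c.

    jail_of(x) is assumed to return the jail owning x (or None if x is
    unjailed); accessing_jail is the jail the running code belongs to
    (None for trusted, unjailed code).
    """
    if accessing_jail is None:                       # (a) code is not jailed
        return True
    if jail_of(obj) is accessing_jail:               # (b) same jail
        return True
    access = getattr(type(obj), '__access__', None)  # (c) object opts in
    return access is not None and access(obj, accessing_jail)
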
For the jail to be able to access 'open', it needs to be given access
explicitly. I haven't quite decided how this should work, but I had in
mind the creation of a "guard" (essentially a proxy) that allows the jail
to access the object. The guard belongs to the same jail as the guarded
object (and is therefore impossible to create within a jail unless the
guarded object belongs to that same jail). It carries a list of jails (or
None for 'any') that it will allow to __access__ it (the guard is
immutable, so jails can't mess with it even though they can access it),
and a specification of what it will allow through it (read-write,
read-only, call-within-jail, call-outside-jail).
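
Here's a rough sketch of what such a guard might look like as a
pure-Python proxy; the class, its modes, and JailError are all
hypothetical placeholders for whatever the real mechanism would be:

class JailError(Exception):   # placeholder, as in the earlier sketch
    pass

class Guard:
    """Immutable proxy granting jails limited access to a trusted object.

    'jails' is the whitelist of jails allowed through (None means any
    jail); 'modes' is what the guard lets through ('read', 'call').
    """
    __slots__ = ('_target', '_jails', '_modes')

    def __init__(self, target, jails=None, modes=('read',)):
        object.__setattr__(self, '_target', target)
        object.__setattr__(self, '_jails', jails)
        object.__setattr__(self, '_modes', frozenset(modes))

    def __access__(self, jail):
        # Criterion (c): only whitelisted jails may touch the guard itself.
        return self._jails is None or jail in self._jails

    def __getattr__(self, name):
        if 'read' not in self._modes:
            raise JailError('read access not granted')
        # Recursively guard the result, so sys.stdout and then
        # sys.stdout.write stay behind guards as well.
        return Guard(getattr(self._target, name), self._jails, self._modes)

    def __call__(self, *args, **kwargs):
        if 'call' not in self._modes:
            raise JailError('call access not granted')
        return self._target(*args, **kwargs)

    def __setattr__(self, name, value):
        raise JailError('guards are immutable')

A jail handed Guard(sys, modes=('read', 'call')) could then reach
sys.stdout.write through a chain of guards without ever holding the real
sys module.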

I have a couple of remaining issues that I haven't quite sussed out:
* How exactly do guards work? I had in mind a system of proxies that
recursively return more guards after operations (e.g., if I have a guard
allowing read+call on sys, then sys.stdout would return another guard
allowing read+call on sys.stdout, and likewise for sys.stdout.write).
Memory usage is a concern, especially in memory-limited jails - maybe
allow __access__ to return specific modes of access rather than
all-or-nothing?
* How are objects sealed in the jail? j.acquire can lead to serious
problems with lots of references getting recursively sealed in. Maybe
disallow sealing in anything but code objects, or allow explicitly
running code within a jail, like j.execute(code, globals(), locals()),
which works fine since any objects created by jailed code are also
jailed. (A sketch of that usage follows after this list.)
* How do imports work? Should __import__ be modified so that when a jail
invokes it, the import runs normally (unjailed), and then returns the
module wrapped in a guard?
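
If the j.execute route were taken, using it might look something like the
sketch below (the jail type and every method on it are, of course, still
hypothetical):

# Hypothetical usage of the proposed API; none of this exists today.
j = jail()
j.setmemorylimit(1 * 1024 * 1024)   # at most 1 MiB of jailed objects
j.setcpulimit(0.5)                  # at most half a second of CPU time

untrusted_source = """
data = [n * n for n in range(100)]
total = sum(data)
"""

code = compile(untrusted_source, '<untrusted>', 'exec')
sandbox = {}
try:
    j.execute(code, sandbox, sandbox)
except MemoryError:
    print("the jail exceeded its memory limit")
else:
    print(sandbox['total'])   # objects created by jailed code are jailed too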

Re: [Python-Dev] Python jails

2011-06-10 Thread Sam Edwards
All,

Thanks for the quick responses!

I skimmed the pysandbox code yesterday. I think Victor has the right
idea in relying on a whitelist, as well as limiting execution time.
The fact that untrusted code can still execute memory exhaustion attacks
is the only thing that still worries me: It's hard to write a server
that will run hundreds of scripts from untrusted users, since one of
them can bring down the entire server by writing an infinite loop that
allocates tons of objects. Python needs a way to hook the
object-allocation process in order to (effectively) limit how much
memory untrusted code can consume.
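
For what it's worth, the closest thing available today seems to be a
process-wide address-space cap via the resource module (Unix-only) - a
rough sketch, assuming nothing else in the process needs more than the
cap:

import resource

# Process-wide cap, not per-jail: every jail sharing this interpreter
# would hit the same limit, which is exactly the problem described above.
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (256 * 1024 * 1024, hard))

try:
    hog = []
    while True:
        hog.append(bytearray(1024 * 1024))   # allocate until the cap bites
except MemoryError:
    print("allocation refused once the process-wide limit was reached")

That caps the whole interpreter, so it can't tell one jail's usage from
another's - hence the need for an allocation hook.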

Tav's blog post makes some interesting points... The object-capability
model definitely has the benefit of efficiency; simply getting the
reference to an object means the untrusted code is trusted with full
capability to that object (which saves having to query the jail every
time the object is touched) - it's just as fast as unrestricted Python,
which I like. Perhaps my jails idea should then be refactored into some
mechanism for monitoring and limiting memory and CPU usage - it's the
perfect thing to ship as an extension; the only shame is that it requires
interpreter support.
Anyway, in light of Tav's post, which seems to suggest that f_restricted
frames are impossible to escape (if used correctly), why was
f_restricted removed in Python 3? Is it simply that it's too easy to
make a mistake and accidentally give an attacker an unsafe object, or is
there some fundamental flaw with it? Could you see something like
f_restricted (or f_jail) getting put back in Python 3, if it were a good
deal more bulletproof?

And, yeah, I've been playing with RestrictedPython. It's pretty good,
but it lacks memory- and CPU-limiting, which is my main focus right now.
And yes, I should probably have posted this to python-ideas, thanks. :)
This is a very long way away from a PEP.
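
On the CPU-limiting side (the gap I mentioned above), the best workaround
I know of today is also process-wide: a RLIMIT_CPU soft limit plus a
SIGXCPU handler (again Unix-only). Roughly:

import resource
import signal

class CPULimitExceeded(Exception):
    pass

def _on_sigxcpu(signum, frame):
    raise CPULimitExceeded("CPU time limit exceeded")

# Process-wide CPU-time cap: the kernel sends SIGXCPU once the soft limit
# (total CPU seconds used by the process) is crossed, and the handler
# turns it into a catchable exception.
signal.signal(signal.SIGXCPU, _on_sigxcpu)
_, hard = resource.getrlimit(resource.RLIMIT_CPU)
resource.setrlimit(resource.RLIMIT_CPU, (2, hard))

try:
    while True:        # runaway busy loop standing in for untrusted code
        pass
except CPULimitExceeded:
    print("stopped the loop after roughly 2 seconds of CPU time")

Again, that's one budget for the whole process, not per-jail.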

PyPy's sandboxing feature is probably closest to what I'd like, but I'm
looking for something that can coexist in the same process (running
hundreds of interpreter processes continuously carries a lot of system
memory overhead, so it's better if the many untrusted, but independent,
jails could share a single interpreter).