[Python-Dev] Summary of Python tracker Issues

2011-06-10 Thread Python tracker

ACTIVITY SUMMARY (2011-06-03 - 2011-06-10)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2826 (+11)
  closed 21268 (+47)
  total  24094 (+58)

Open issues with patches: 1236 


Issues opened (45)
==================

#5906: Risk of confusion in multiprocessing module - daemonic process
http://bugs.python.org/issue5906  reopened by pakal

#9516: sysconfig: $MACOSX_DEPLOYMENT_TARGET mismatch: now "10.3" but 
http://bugs.python.org/issue9516  reopened by eric.araujo

#12255: A few changes to .*ignore
http://bugs.python.org/issue12255  opened by eric.araujo

#12256: Link isinstance/issubclass doc to abc module
http://bugs.python.org/issue12256  opened by eric.araujo

#12257: Rework/replace use of DISTUTILS_USE_SDK in packaging
http://bugs.python.org/issue12257  opened by eric.araujo

#12258: Clean up bytes I/O in get_compiler_versions
http://bugs.python.org/issue12258  opened by eric.araujo

#12259: Test and document which compilers can be created on which plat
http://bugs.python.org/issue12259  opened by eric.araujo

#12260: Make install default to user site-packages
http://bugs.python.org/issue12260  opened by eric.araujo

#12261: urllib.parse docs still refer to urlparse
http://bugs.python.org/issue12261  opened by tlesher

#12263: punycode codec ignores the error handler argument
http://bugs.python.org/issue12263  opened by haypo

#12268: file readline, readlines & readall methods can lose data on EI
http://bugs.python.org/issue12268  opened by gregory.p.smith

#12271: Python 2.7.x on IA64 running SLES 11 SP1
http://bugs.python.org/issue12271  opened by v.claudic

#12272: Python 2.7.1 version conflict for package "Tcl" on Windows 7
http://bugs.python.org/issue12272  opened by jackie3

#12273: Change ast.__version__ calculation  to provide consistent order
http://bugs.python.org/issue12273  opened by ncoghlan

#12274: "Print window" menu on IDLE aborts whole application
http://bugs.python.org/issue12274  opened by gagenellina

#12275: urllib.request.HTTPRedirectHandler won't redirect to a URL wit
http://bugs.python.org/issue12275  opened by lilydjwg

#12276: 3.x ignores sys.tracebacklimit=0
http://bugs.python.org/issue12276  opened by gagenellina

#12277: Missing comma in os.walk docs
http://bugs.python.org/issue12277  opened by Retro

#12278: Core mentorship mention in the devguide
http://bugs.python.org/issue12278  opened by adam.woodbeck

#12279: Add build_distinfo command to packaging
http://bugs.python.org/issue12279  opened by eric.araujo

#12281: bytes.decode('mbcs', 'ignore') does replace undecodable bytes 
http://bugs.python.org/issue12281  opened by haypo

#12282: build process adds CWD (null entry) to PYTHONPATH if PYTHONPAT
http://bugs.python.org/issue12282  opened by r.david.murray

#12284: argparse.ArgumentParser: usage example option
http://bugs.python.org/issue12284  opened by jonash

#12285: Unexpected behavior for 0 or negative processes in multiproces
http://bugs.python.org/issue12285  opened by jorgsk

#12287: ossaudiodev: stack corruption with FD >= FD_SETSIZE
http://bugs.python.org/issue12287  opened by neologix

#12288: Python 2.7.1 tkSimpleDialog initialvalue
http://bugs.python.org/issue12288  opened by busfault

#12289: http.server.CGIHTTPRequestHandler doesn't check if a Python sc
http://bugs.python.org/issue12289  opened by haypo

#12290: __setstate__ is called for false values
http://bugs.python.org/issue12290  opened by eltoder

#12291: file written using marshal in 3.2 can be read by 2.7, but not 
http://bugs.python.org/issue12291  opened by vinay.sajip

#12294: multiprocessing.Pool: Need a way to find out if work are finis
http://bugs.python.org/issue12294  opened by mozbugbox

#12295: Fix ResourceWarning in turtledemo help window
http://bugs.python.org/issue12295  opened by eric.araujo

#12296: Minor clarification in devguide
http://bugs.python.org/issue12296  opened by eric.araujo

#12297: Clarifications to atexit.register and unregister doc
http://bugs.python.org/issue12297  opened by eric.araujo

#12299: Stop documenting functions added by site as builtins
http://bugs.python.org/issue12299  opened by eric.araujo

#12300: Document pydoc.help
http://bugs.python.org/issue12300  opened by eric.araujo

#12301: Use :data:`sys.thing` instead of ``sys.thing`` throughout
http://bugs.python.org/issue12301  opened by eric.araujo

#12302: packaging test command needs access to .dist-info
http://bugs.python.org/issue12302  opened by michael.mulich

#12303: expose sigwaitinfo() and sigtimedwait() in the signal module
http://bugs.python.org/issue12303  opened by haypo

#12304: expose signalfd(2) in the signal module
http://bugs.python.org/issue12304  opened by haypo

#12305: Building PEPs doesn't work on Python 3
http://bugs.python.org/issue12305  opened by ericsnow

#12306: zlib: Expose zlibVersion to query runtime version of zlib
http://bugs.py

Re: [Python-Dev] cpython: Remove some extraneous parentheses and swap the comparison order to

2011-06-10 Thread Guido van Rossum
On Wed, Jun 8, 2011 at 8:12 AM, Nick Coghlan wrote:
> On Wed, Jun 8, 2011 at 7:35 AM, David Malcolm wrote:
>> After ~12 years of doing this, it comes naturally.  I appreciate that
>> this may come across as weird though :)
>
> I actually thought Brett's rationale in the checkin comment was
> reasonable (if you get in the habit of putting constants on the left,
> then the classic "'=' instead of '=='" typo is a compiler error
> instead of a reassignment).

I really like consistency across the code base. I really don't like
constant-on-the-left, and it's basically not used in the current
codebase. Please be consistent and don't start using it.

> Call it a +0 in favour of letting people put constants on the left in
> C code if they prefer it that way, so long as any given if/elif chain
> is consistent in the style it uses.

Sorry, I give it a -1. (I'd like to be able to read the codebase still... :-)

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PEP 3101 implementation vs. documentation

2011-06-10 Thread Ben Wolfson
Hello,

I'm writing because discussion in a bug report I submitted
() has suggested that, insofar as
at least part of the issue revolves around the interpretation of PEP
3101, that aspect belonged on python-dev. In particular, I was told
that the PEP, not the documentation, is authoritative. Since I'm the
one who thinks something is wrong, it seems appropriate for me to be
the one to bring it up.

Basically, the issue is that the current behavior of str.format is at
variance at least with the documentation
, is almost
certainly at variance with PEP3101 in one respect, and is in my
opinion at variance with PEP3101 in another respect as well, regarding
what characters can be present in what the grammar given in the
documentation calls an element_index, that is, the bit between the
square brackets in "{0.attr[idx]}".

Both discovering the current behavior and interpreting the
documentation are pretty straightforward; interpreting what the PEP
actually calls for is more vexed. I'll do the first two things first.
TOC for the remainder:

1. What does the current implementation do?
2. What does the documentation say?
3. What does the PEP say? [this part is long, but the PEP is not
clear, and I wanted to be thorough]
4. Who cares?

1. What does the current implementation do?

Suppose you have this dictionary:

d = {"@": 0,
 "!": 1,
 ":": 2,
 "^": 3,
 "}": 4,
 "{": {"}": 5},
}

Then the following expressions have the following results:

(a) "{0[@]}".format(d)    --> '0'
(b) "{0[!]}".format(d)    --> ValueError: Missing ']' in format string
(c) "{0[:]}".format(d)    --> ValueError: Missing ']' in format string
(d) "{0[^]}".format(d)    --> '3'
(e) "{0[}]}".format(d)    --> ValueError: Missing ']' in format string
(f) "{0[{]}".format(d)    --> ValueError: unmatched '{' in format
(g) "{0[{][}]}".format(d) --> '5'

Given (e) and (f), I think (g) should be a little surprising, though
you can probably guess what's going on and it's not hard to see why it
happens if you look at the source: (e) and (f) fail because
MarkupIterator_next (in Objects/stringlib/string_format.h) scans
through the string looking for curly braces, because it treats them as
semantically significant no matter what context they occur in. So,
according to MarkupIterator_next, the *first* right curly brace in (e)
indicates the end of the replacement field, giving "{0[}". In (f), the
second left curly brace indicates (to MarkupIterator_next) the start
of a *new* replacement field, and since there's only one right curly
brace, it complains. In (g), MarkupIterator_next treats the second
left curly brace as starting a new replacement field and the first
right curly brace as closing it. However, actually, those braces don't
define new replacement fields, as indicated by the fact that the whole
expression treats the element_index fields as just plain old strings.
(So the current implementation is somewhat schizophrenic, acting at
one point as if the braces have to be balanced because '{[]}' is a
replacement field and at another point treating the braces as
insignificant.)

The explanation for (b) and (c) is that parse_field (same source file)
treats ':' and '!'  as indicating the end of the field_name section of
the replacement field, regardless of whether those characters occur
within square brackets or not.

So, that's what the current implementation does, in those cases.
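
For anyone who wants to re-run cases (a)-(g) on their own interpreter,
here is a minimal sketch. The names probe, templates and results are
only illustrative (they are not from the bug report), and the messages
you get may well differ from the ones listed above, since they depend on
the Python version in use.

```python
d = {"@": 0,
     "!": 1,
     ":": 2,
     "^": 3,
     "}": 4,
     "{": {"}": 5},
}

# The seven format strings from cases (a)-(g).
templates = {
    "a": "{0[@]}",
    "b": "{0[!]}",
    "c": "{0[:]}",
    "d": "{0[^]}",
    "e": "{0[}]}",
    "f": "{0[{]}",
    "g": "{0[{][}]}",
}


def probe(template, *args):
    """Return the formatted string, or a description of the error raised."""
    try:
        return template.format(*args)
    except Exception as exc:  # report whatever this interpreter raises
        return "{}: {}".format(type(exc).__name__, exc)


results = {label: probe(template, d) for label, template in templates.items()}

for label in sorted(results):
    print("({}) {!r} --> {}".format(label, templates[label], results[label]))
```

Only (a) and (d) involve no character that the parser might treat
specially, so those two are the ones I would expect to behave the same
everywhere.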

2. What does the documentation say?

The documentation gives a grammar for replacement fields:

"""
replacement_field ::=  "{" [field_name] ["!" conversion] [":" format_spec] "}"
field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
arg_name          ::=  [identifier | integer]
attribute_name    ::=  identifier
element_index     ::=  integer | index_string
index_string      ::=  <any source character except "]"> +
conversion        ::=  "r" | "s"
format_spec       ::=  <described in the next section>
"""

Given this definition of index_string, all of (a)--(g) should be
legal, and the results should be the strings '0', '1', '2', '3', '4',
"{'}': 5}", and '5'. There is no need to exclude ':', '!', '}', or '{'
from the index_string field; allowing them creates no ambiguity,
because the field is delimited by square brackets.
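
To make that reading concrete, here is a rough sketch of a resolver that
follows the documented grammar for field_name literally. It is not how
str.format is implemented; resolve_field_name and the two regexes are
hypothetical names, identifiers are assumed to be ASCII-only to keep it
short, and the empty arg_name case, conversions and format specs are
deliberately left out.

```python
import re

# arg_name per the documented grammar: an identifier or an integer.
# (ASCII-only identifiers are an assumption made for brevity.)
_ARG_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[0-9]+")
# One accessor: ".identifier", or "[" followed by anything except "]".
_ACCESSOR = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[([^\]]+)\]")


def resolve_field_name(field_name, args, kwargs):
    """Look up field_name as the documented grammar reads (hypothetical helper)."""
    head = _ARG_NAME.match(field_name)
    if head is None:
        raise ValueError("this sketch needs an explicit arg_name")
    name = head.group(0)
    # Integers select positional arguments, identifiers select keywords.
    obj = args[int(name)] if name[0].isdigit() else kwargs[name]
    pos = head.end()
    while pos < len(field_name):
        step = _ACCESSOR.match(field_name, pos)
        if step is None:
            raise ValueError("bad accessor at offset %d" % pos)
        attr, index = step.groups()
        if attr is not None:
            obj = getattr(obj, attr)       # "." attribute_name
        elif re.fullmatch(r"[0-9]+", index):
            obj = obj[int(index)]          # integer element_index
        else:
            obj = obj[index]               # index_string, used as a str key
        pos = step.end()
    return obj


d = {"@": 0, "!": 1, ":": 2, "^": 3, "}": 4, "{": {"}": 5}}
for field in ["0[@]", "0[!]", "0[:]", "0[^]", "0[}]", "0[{]", "0[{][}]"]:
    print(field, "->", resolve_field_name(field, (d,), {}))
```

Because the only character excluded from index_string is "]", none of
"!", ":", "{" or "}" needs special treatment once the brackets have been
matched.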

Tangent: the definition of attribute_name here also does not describe
the current behavior ("{0.  ;}".format(x) works fine and will call
getattr(x, " ;")) and the PEP does not require the attribute_name to
be an identifier; in fact it explicitly states that the attribute_name
doesn't need to be a valid Python identifier. attribute_name should
read (to reflect something like actual behavior, anyway) " +". The same goes
for arg_name (with the integer alternation). Observe:

>>> x = lambda: None
>>> setattr(x, ']]', 3)
>>> "{].]]}".format(**{"]":x}) # (h)
'3'

One can also presently do this (hence "something like actual behavior"):
>>> setattr(x, 'f}', 4)
>>> "{a{s.f}}".format(**{"a{s":x})
'4'
But 

[Python-Dev] Python jails

2011-06-10 Thread Sam Edwards
Hello! This is my first posting to the python-dev list, so please
forgive me if I violate any unspoken etiquette here. :)

I was looking at Python 2.x's f_restricted frame flag (or, rather, the
numerous ways around it) and noticed that most (all?) of the attacks to
escape restricted execution involved the attacker grabbing something he
wasn't supposed to have. IMO, Python's extensive introspection features
make that a losing battle, since it's simply too easy to forget to
blacklist something and have the attacker find it. Not only that, even
with a perfect vacuum-sealed jail, an attacker can still bring down the
interpreter by exhausting memory or consuming excess CPU.

I think I might have a way of securely sealing-in untrusted code. It's a
fairly nascent idea, though, and I haven't worked out all of the details
yet, so I'm posting what I have so far for feedback and for others to
try to poke holes in it.

Absolutely nothing here is final. I'm just framing out what I generally
had in mind. Obviously, it will need to be adjusted to be consistent
with "the Python way" - my hope is that this can become a PEP. :)


>>> # It all starts with the introduction of a new type, called a jail.
... # (I haven't yet worked out whether it should be a builtin type,
... # or a module.) Unjailed code can create jails, which will run the
... # untrusted code and keep strict limits on it.
...
>>> j = jail()
>>> dir(j)
['__class__', '__delattr__', '__doc__', '__format__',
'__getattribute__', '__hash__',
'__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__',
'__sizeof__', '__str__', '__subclasshook__', 'acquire', 'getcpulimit',
'getcpuusage',
'getmemorylimit', 'getmemoryusage', 'gettimelimit', 'gettimeusage',
'release',
'setcpulimit', 'setmemorylimit', 'settimelimit']
>>> # The jail monitors three things: Memory (in bytes), real time (in
... # seconds), and CPU time (also in seconds), and it also allows you to
... # impose limits on them. If any limit is non-zero, code in that jail
... # may not exceed its limit. Exceeding a memory limit will result in a
... # MemoryError. I haven't decided what CPU/real time limits should raise.
... # The other two calls are "acquire" and "release," which allow you to
... # seal (any) objects inside the jail, or bust them out. Objects inside
... # the jail (i.e. created by code in that jail) contribute their
... # __sizeof__() to the j.getmemoryusage()
...
>>> def stealPasswd():
...     return open('/etc/passwd','r').read()
...
>>> j.acquire(stealPasswd)
>>> j.getmemoryusage() # The stealPasswd function, its code, etc. are now locked away within the jail.
375
>>> stealPasswd()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
JailError: tried to access an object outside of the jail
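
The accounting side of the session above can be mocked up in ordinary
Python, which may help when poking at the semantics. This is only a toy:
MockJail is a hypothetical name, it isolates nothing, and it counts just
the shallow sys.getsizeof() of each acquired object rather than the
function's code and everything else a real jail would have to track.

```python
import sys


class MockJail:
    """Toy model of the proposed memory accounting; it isolates nothing."""

    def __init__(self):
        self._objects = {}      # id(obj) -> obj, for everything acquired
        self._memory_limit = 0  # 0 means "no limit", as in the proposal

    def setmemorylimit(self, limit):
        self._memory_limit = limit

    def getmemorylimit(self):
        return self._memory_limit

    def getmemoryusage(self):
        # Shallow sizes only; referenced objects are not followed.
        return sum(sys.getsizeof(o) for o in self._objects.values())

    def acquire(self, obj):
        if id(obj) in self._objects:
            return  # already sealed in, do not count it twice
        needed = self.getmemoryusage() + sys.getsizeof(obj)
        if self._memory_limit and needed > self._memory_limit:
            raise MemoryError("jail memory limit exceeded")
        self._objects[id(obj)] = obj

    def release(self, obj):
        self._objects.pop(id(obj), None)


j = MockJail()
data = "x" * 1000
j.acquire(data)
print(j.getmemoryusage() == sys.getsizeof(data))
```

A real implementation would need the interpreter to charge allocations
to a jail as they happen; the explicit acquire() check here only stands
in for that.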

The object in question is, of course, 'open'. Unlike the f_restricted
model, the jail was freely able to grab the open() function, but was
absolutely unable to touch it: It can't call it, set/get/delete
attributes/items, or pass it as an argument to any functions. There are
three criteria that determine whether an object can be accessed:
a. The code accessing the object is not within a jail; or
b. The object belongs to the same jail as the code accessing the object; or
c. The object has an __access__ function, and
theObject.__access__(theJail) returns True.
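
Written out as code, the three criteria might look like the sketch
below. Everything in it is an assumption for illustration: the Jail
placeholder class, the can_access name, passing the owning jail in
explicitly (the real thing would have the interpreter track ownership),
and using None to mean "not within a jail".

```python
class Jail:
    """Placeholder for the proposed jail type (hypothetical)."""


def can_access(obj, owner, accessor):
    """Apply criteria a, b and c.

    owner is the jail obj belongs to, accessor is the jail of the code
    touching it; either may be None, meaning "unjailed".
    """
    if accessor is None:       # (a) the accessing code is not jailed
        return True
    if owner is accessor:      # (b) object and code share a jail
        return True
    # (c) the object itself grants access to this particular jail
    access = getattr(obj, "__access__", None)
    return access is not None and bool(access(accessor))


class Shared:
    """Example object that grants access to exactly one jail."""

    def __init__(self, allowed):
        self._allowed = allowed

    def __access__(self, jail):
        return jail is self._allowed


j1, j2 = Jail(), Jail()
print(can_access(open, None, j1))        # unjailed builtin, jailed caller
print(can_access(Shared(j1), None, j1))  # explicitly shared with j1
```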

For the jail to be able to access 'open', it needs to be given access
explicitly. I haven't quite decided how this should work, but I had in
mind the creation of a "guard" (essentially a proxy) that allows the
jail to access the object. It belongs to the same jail as the guarded
object (and is therefore impossible to create within a jail unless the
guarded object belongs to the same jail), has a list of jails (or None
for 'any') that the guard will allow to __access__ it (the guard is
immutable, so jails can't mess with it even though they can access it),
and specifies what the guard will allow through it (read-write,
read-only, call-within-jail, call-outside-jail).
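
As a rough illustration of the shape such a guard could take (a sketch
under my own assumptions, not a worked-out design): the class name
Guard, its constructor arguments and the allows() method are all
hypothetical, and the immutability shown here is only the ordinary
Python kind, which object.__setattr__ can bypass, so a real guard would
need interpreter support.

```python
class Guard:
    """Hypothetical proxy recording which jails may reach a target, and how."""

    __slots__ = ("_target", "_jails", "_modes")

    MODES = frozenset(["read-write", "read-only",
                       "call-within-jail", "call-outside-jail"])

    def __init__(self, target, jails=None, modes=("read-only",)):
        unknown = set(modes) - self.MODES
        if unknown:
            raise ValueError("unknown access modes: %r" % sorted(unknown))
        # Bypass our own __setattr__ once, during construction only.
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_jails",
                           None if jails is None else tuple(jails))
        object.__setattr__(self, "_modes", frozenset(modes))

    def __setattr__(self, name, value):
        raise AttributeError("guards are immutable")

    def __delattr__(self, name):
        raise AttributeError("guards are immutable")

    def __access__(self, jail):
        # None means 'any' jail; otherwise compare by identity.
        return self._jails is None or any(j is jail for j in self._jails)

    def allows(self, mode):
        return mode in self._modes


class Jail:
    """Placeholder for the proposed jail type (hypothetical)."""


j = Jail()
g = Guard(open, jails=[j], modes=["call-outside-jail"])
print(g.__access__(j), g.allows("read-write"))
```

Forwarding attribute access and calls through the guard is left out,
since how that should work is exactly the open question.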

I have a couple remaining issues that I haven't quite sussed out:
* How exactly do guards work? I had in mind a system of proxies (memory
  usage is a concern, especially in memory-limited jails - maybe allow
  __access__ to return specific modes of access rather than
  all-or-nothing?) that recursively return more guards after operations.
  (e.g., if I have a guard allowing read+call on sys, sys.stdout would
  return another guard allowing read+call on sys.stdout, likewise for
  sys.stdout.write)
* How are objects sealed in the jail? j.acquire can lead to serious
  problems with lots of references getting recursively sealed in. Maybe
  disallow sealing in anything but code objects, or allow explicitly
  running code within a jail like j.execute(code, globals(), locals()),
  which works fine since any objects created by jailed code are also
  jailed.
* How do imports work? Should __import__ be modified so that when a jail
  invokes it, the import runs normally (unjailed), and then returns the
  module w

Re: [Python-Dev] Python jails

2011-06-10 Thread R. David Murray
On Fri, 10 Jun 2011 18:23:47 -0600, Sam Edwards wrote:
> Hello! This is my first posting to the python-dev list, so please
> forgive me if I violate any unspoken etiquette here. :)

Well, hopefully we won't bite, though of course I can't promise anything
for anyone else :)

I haven't read through your post, but if you don't know about it I
suspect that you will be interested in the following:

http://code.activestate.com/pypm/pysandbox/

I'm pretty sure Victor will be happy to have someone else interested in
this topic.

--
R. David Murray   http://www.bitdance.com


Re: [Python-Dev] Python jails

2011-06-10 Thread Guido van Rossum
Hi Sam,

Have you seen this?
http://tav.espians.com/paving-the-way-to-securing-the-python-interpreter.html

It might relate a similar idea. There were a few iterations of Tav's approach.

--Guido


Re: [Python-Dev] Python jails

2011-06-10 Thread P.J. Eby

At 06:23 PM 6/10/2011 -0600, Sam Edwards wrote:
> I have a couple remaining issues that I haven't quite sussed out:
> [long list of questions deleted]


You might be able to answer some of them by looking at this project:

  http://pypi.python.org/pypi/RestrictedPython

Which implements the necessary ground machinery for doing that sort 
of thing, in the form of a specialized Python compiler (implemented 
in Python, for 2.3 through 2.7) that allows you to implement whatever 
sorts of guards and security policies you want on top of it.


Even if it doesn't answer all your questions in and of itself, it may 
prove a fruitful environment in which you can experiment with various 
approaches and see which ones you actually like, without first having 
to write a bunch of code yourself.


Discussing an official implementation of this sort of thing as a 
language feature is probably best left to python-ideas, though, until 
and unless you actually have a PEP to propose.




Re: [Python-Dev] Python jails

2011-06-10 Thread Sam Edwards
All,

Thanks for the quick responses!

I skimmed the pysandbox code yesterday. I think Victor has the right
idea with relying on a whitelist, as well as limiting execution time.
The fact that untrusted code can still execute memory exhaustion attacks
is the only thing that still worries me: It's hard to write a server
that will run hundreds of scripts from untrusted users, since one of
them can bring down the entire server by writing an infinite loop that
allocates tons of objects. Python needs a way to hook the
object-allocation process in order to (effectively) limit how much
memory untrusted code can consume.

Tav's blog post makes some interesting points... The object-capability
model definitely has the benefit of efficiency; simply getting the
reference to an object means the untrusted code is trusted with full
capability to that object (which saves having to query the jail every
time the object is touched) - it's just as fast as unrestricted Python,
which I like. Perhaps my jails idea should then be refactored into some
mechanism for monitoring and limiting memory and CPU usage -- it's the
perfect thing to ship as an extension; the only shame is that it
requires interpreter support.

Anyway, in light of Tav's post, which seems to suggest that f_restricted
frames are impossible to escape (if used correctly), why was
f_restricted removed in Python 3? Is it simply that it's too easy to
make a mistake and accidentally give an attacker an unsafe object, or is
there some fundamental flaw with it? Could you see something like
f_restricted (or f_jail) getting put back in Python 3, if it were a good
deal more bulletproof?

And, yeah, I've been playing with RestrictedPython. It's pretty good,
but it lacks memory- and CPU-limiting, which is my main focus right now.
And yes, I should probably have posted this to python-ideas, thanks. :)
This is a very long way away from a PEP.

PyPy's sandboxing feature is probably closest to what I'd like, but I'm
looking for something that can coexist in the same process (since
running hundreds of interpreter processes continuously has a lot of
system memory overhead, it's better if the many untrusted, but
independent, jails could share a single interpreter).