Re: [Python-Dev] cpython: Use cached builtins.

2013-10-03 Thread Georg Brandl
On 02.10.2013 at 21:58, Victor Stinner wrote:
> I don't remember where, but I remember that I also saw things like
> "str=str, len=len, ...". So you keep the same name, but you use fast
> local lookups instead of slow builtin lookups.

In this case they aren't even fast local lookups but (slightly)
faster module global lookups.  Not worth the effort IMO.

Georg

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Use cached builtins.

2013-10-03 Thread Nick Coghlan
On 3 Oct 2013 06:00, "Victor Stinner"  wrote:
>
> I don't remember where, but I remember that I also saw things like
> "str=str, len=len, ...". So you keep the same name, but you use fast
> local lookups instead of slow builtin lookups.

functools uses the local binding trick in lru_cache as a speed hack (pretty
sure it uses an underscore prefix, though).

However lru_cache *is* likely to end up being speed critical *and* it's
binding local variables, so it's actually shifting a lot more work to
compile time than merely trading a builtin lookup for a global lookup does.

For most code though, introducing that kind of complexity isn't worth the
cost in readability.

Cheers,
Nick.

>
> Victor
>
> 2013/10/2 Antoine Pitrou :
> > On Wed,  2 Oct 2013 18:16:48 +0200 (CEST)
> > serhiy.storchaka  wrote:
> >> http://hg.python.org/cpython/rev/d48ac94e365f
> >> changeset:   85931:d48ac94e365f
> >> user:Serhiy Storchaka 
> >> date:Wed Oct 02 19:15:54 2013 +0300
> >> summary:
> >>   Use cached builtins.
> >
> > What's the point? I don't think it's a good idea to uglify the code if
> > there isn't a clear benefit.
> >
> > Regards
> >
> > Antoine.
> >
> >


Re: [Python-Dev] cpython: Use cached builtins.

2013-10-03 Thread Michael Foord

On 3 Oct 2013, at 12:05, Nick Coghlan  wrote:

> 
> On 3 Oct 2013 06:00, "Victor Stinner"  wrote:
> >
> > I don't remember where, but I remember that I also saw things like
> > "str=str, len=len, ...". So you keep the same name, but you use fast
> > local lookups instead of slow builtin lookups.
> 
> functools uses the local binding trick in lru_cache as a speed hack (pretty 
> sure it uses an underscore prefix, though).
> 


Inside a function you *have* to use an underscore prefix (well, some alternate
name anyway) - otherwise the assignment makes the name local and the lookup
would fail with an unbound local error. In order to do the binding at
definition time rather than call time (so only once) the assignment is often
done in the function signature. This is *particularly* ugly as it not only
messes up the use of the builtins but screws up your function signature too.
So it should really only be done where benchmarking proves it makes a
difference. I've never managed to find such a case...
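
To make the point above concrete, here is a minimal sketch of the three variants being discussed. The function names (`count_items_global`, `count_items_bound`, `count_items_broken`) are hypothetical and only serve the illustration; this is not code from functools or from the changeset.

```python
def count_items_global(seq):
    # Plain version: `len` is looked up by name on every call
    # (module globals first, then builtins).
    return len(seq)


def count_items_bound(seq, _len=len):
    # Speed-hack version: `_len` is a default argument, bound once when the
    # `def` statement runs, so inside the body it is an ordinary local.
    # Note how the extra parameter leaks into the signature.
    return _len(seq)


def count_items_broken(seq):
    # Reusing the builtin's own name makes `len` local to this function,
    # so the right-hand side reads the local before it has a value.
    len = len
    return len(seq)


# Names the compiler treats as global/builtin lookups vs. local variables.
print(count_items_global.__code__.co_names)    # ('len',)
print(count_items_bound.__code__.co_varnames)  # ('seq', '_len')

try:
    count_items_broken([1, 2, 3])
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

Whether the bound version is measurably faster depends on the workload, which is why benchmarking before adopting it seems sensible.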

Michael

> However lru_cache *is* likely to end up being speed critical *and* it's 
> binding local variables , so it's actually shifting a lot more work to 
> compile time than merely trading a builtin lookup for a global lookup does.
> 
> For most code though, introducing that kind of complexity isn't worth the 
> cost in readability.
> 
> Cheers,
> Nick.
> 
> >
> > Victor
> >
> > 2013/10/2 Antoine Pitrou :
> > > On Wed,  2 Oct 2013 18:16:48 +0200 (CEST)
> > > serhiy.storchaka  wrote:
> > >> http://hg.python.org/cpython/rev/d48ac94e365f
> > >> changeset:   85931:d48ac94e365f
> > >> user:Serhiy Storchaka 
> > >> date:Wed Oct 02 19:15:54 2013 +0300
> > >> summary:
> > >>   Use cached builtins.
> > >
> > > What's the point? I don't think it's a good idea to uglify the code if
> > > there isn't a clear benefit.
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >


--
http://www.voidspace.org.uk/


May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing 
http://www.sqlite.org/different.html







[Python-Dev] summing integer and class

2013-10-03 Thread Igor Vasilyev

Hi.

Example test.py:

class A():
    def __add__(self, var):
        print("I'm in A class")
        return 5
a = A()
a+1
1+a

Execution:
python test.py
I'm in A class
Traceback (most recent call last):
  File "../../test.py", line 7, in <module>
    1+a
TypeError: unsupported operand type(s) for +: 'int' and 'instance'


So adding an integer to a class instance works fine, but adding a class
instance to an integer fails. I could not understand why this happens. In
Objects/abstract.c we have the following function:


static PyObject *
binary_op1(PyObject *v, PyObject *w, const int op_slot)
{
    PyObject *x;
    binaryfunc slotv = NULL;
    binaryfunc slotw = NULL;

    if (v->ob_type->tp_as_number != NULL)
        slotv = NB_BINOP(v->ob_type->tp_as_number, op_slot);
    if (w->ob_type != v->ob_type &&
        w->ob_type->tp_as_number != NULL) {
        slotw = NB_BINOP(w->ob_type->tp_as_number, op_slot);
        if (slotw == slotv)
            slotw = NULL;
    }
    if (slotv) {
        if (slotw && PyType_IsSubtype(w->ob_type, v->ob_type)) {
            x = slotw(v, w);
            if (x != Py_NotImplemented)
                return x;
            Py_DECREF(x); /* can't do it */
            slotw = NULL;
        }
        x = slotv(v, w);
        if (x != Py_NotImplemented)
            return x;
        Py_DECREF(x); /* can't do it */
    }
    if (slotw) {
        x = slotw(v, w);
        if (x != Py_NotImplemented)
            return x;
        Py_DECREF(x); /* can't do it */
    }
    Py_RETURN_NOTIMPLEMENTED;
}

When adding a class instance to an integer we have both slotv and slotw.
x = slotv(v, w); -> returns Py_NotImplemented.
But in this case we should then execute x = slotw(v, w); and the function
should complete in the same way as when adding an integer to a class instance.


Can someone please advise where I am mistaken?

--
thanks,
Igor Vasilyev



[Python-Dev] PEP 454: Add a new tracemalloc module (final version)

2013-10-03 Thread Victor Stinner
Hi,

I worked on the implementation of the tracemalloc module and its PEP
454. I consider that this third version of the PEP is ready for a
final review.

What should be done to finish the PEP?

HTML version of the PEP 454:
http://www.python.org/dev/peps/pep-0454/

Full documentation of the tracemalloc module, with examples, a short
tutorial and the documentation of the command line (python -m
tracemalloc):
http://www.haypocalc.com/tmp/tracemalloc/library/tracemalloc.html



I addressed all points of my own TODO list, even minor tasks. I still
have some minor tasks, but only on the implementation, not on the API
of the module. The API is designed to make it easy to process the data
collected by tracemalloc and to be extensible. It should now be easy to
write a GUI to display this data, for example.


API changes between versions 2 and 3 of the PEP:

* Add Metric class: in version 2, a snapshot had a "user_data"
attribute which was not defined, not even its type. Metrics are now
formalized, displayed and can be compared. They are used to track the
process memory, number of Python objects, size of memory traced by
tracemalloc, etc. The API also allows you to add your own metrics.

* Rewrite the "scheduler" API. I removed the restriction of a single
task at a time. tracemalloc now supports multiple tasks which
have a new trigger: threshold on the traced memory. I added a new Task
class. DisplayTopTask and TakeSnapshotTask classes inherit from Task,
to inherit its methods and attributes. It is now possible to schedule
a task only "repeat" times.

* Add DisplayTop.display() method, more convenient than having to
create a DisplayTopTask instance.

* Remove Frame, Trace and TraceStats classes for efficiency.
tracemalloc handles millions of such objects. Creating and
serializing millions of such small objects could take up to 30
seconds. I was not really convinced that a class is required to store
only two fields; a tuple is just fine for a debug module.

* (and many other minor changes)

I optimized the implementation which is now faster and releases the
memory earlier thanks to a new memory pool (only used by the module).


PEP: 454
Title: Add a new tracemalloc module to trace Python memory allocations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 3-September-2013
Python-Version: 3.4


Abstract
========

Add a new ``tracemalloc`` module to trace memory blocks allocated by
Python.



Rationale
=========

Common debug tools tracing memory allocations read the C filename and
line number.  Using such tools to analyze Python memory allocations does
not help because most memory blocks are allocated in the same C function,
in ``PyMem_Malloc()`` for example.

There are debug tools dedicated to the Python language like ``Heapy``
and ``PySizer``. These tools analyze object types and/or contents.  They
are useful when most memory leaks are instances of the same type and
this type is only instantiated in a few functions. Problems arise when
the object type is very common, like ``str`` or ``tuple``, and it is hard
to identify where these objects are instantiated.

Finding reference cycles is also a difficult problem. There are
different tools to draw a diagram of all references. These tools cannot
be used on large applications with thousands of objects because the
diagram is too large to be analyzed manually.


Proposal
========

With PEP 445, it becomes easy to set up a hook on Python memory
allocators. A hook can inspect Python internals to retrieve the Python
tracebacks.

This PEP proposes to add a new ``tracemalloc`` module. It is a debug
tool to trace memory blocks allocated by Python. The module provides the
following information:

* Compute the differences between two snapshots to detect memory leaks
* Statistics on allocated memory blocks per filename and per line
  number: total size, number and average size of allocated memory blocks
* Traceback where a memory block was allocated
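
As an illustration of the snapshot-comparison idea in the list above, here is a minimal sketch. It assumes the ``tracemalloc`` module as it was later released in the standard library (``start()``, ``take_snapshot()``, ``Snapshot.compare_to()``), not the draft API described in this version of the PEP; the ``data`` list is a hypothetical stand-in for a leak.

```python
import tracemalloc

# Assumption: released stdlib API, which differs from this draft's
# enable()/DisplayTop interface.
tracemalloc.start()

before = tracemalloc.take_snapshot()
data = [bytes(1000) for _ in range(1000)]  # stand-in for a "leak"
after = tracemalloc.take_snapshot()

# Group allocations by filename and line number, then diff the snapshots.
stats = after.compare_to(before, "lineno")
for stat in stats[:3]:
    print(stat.traceback, stat.size_diff, stat.count_diff)

tracemalloc.stop()
```

The entries with the largest positive ``size_diff`` are the source lines that allocated the most memory between the two snapshots.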

The API of the tracemalloc module is similar to the API of the
faulthandler module: ``enable()``, ``disable()`` and ``is_enabled()``
functions, an environment variable (``PYTHONFAULTHANDLER`` and
``PYTHONTRACEMALLOC``), a ``-X`` command line option (``-X
faulthandler`` and ``-X tracemalloc``). See the
`documentation of the faulthandler module
`_.

The tracemalloc module has been written for CPython. Other
implementations of Python may not provide it.


API
===

To trace most memory blocks allocated by Python, the module should be
enabled as early as possible by setting the ``PYTHONTRACEMALLOC``
environment variable to ``1``, or by using ``-X tracemalloc`` command
line option. The ``tracemalloc.enable()`` function can also be called to
start tracing Python memory allocations.

By default, a trace of an allocated memory block only stores one frame.
Use the ``set_traceback_limit()`` function to store mo

Re: [Python-Dev] summing integer and class

2013-10-03 Thread Chris Angelico
On Thu, Oct 3, 2013 at 11:09 PM, Игорь Васильев  wrote:
> When we adding class to integer we have both slotv and slotw. x = slotv(v,
> w); -> returns Py_NotImplemented.
> But in this case we should execute x = slotw(v, w); and function should be
> completed in the same way as when we adding integer to class.
>
> Can someone advise please where I mistake.

No need to dig into the CPython source for this, the answer's pretty
simple: 1+a is handled by __radd__ not __add__.

>>> class A():
...     def __add__(self, var):
...         print("I'm in A class")
...         return 5
...     def __radd__(self, var):
...         print("I'm in A class, too")
...         return 6

>>> a=A()
>>> a+1
I'm in A class
5
>>> 1+a
I'm in A class, too
6

You could ask this sort of thing on python-list rather than
python-dev.

ChrisA


Re: [Python-Dev] summing integer and class

2013-10-03 Thread Xavier Morel

On 2013-10-03, at 15:45 , Igor Vasilyev wrote:

> Hi.
> 
> Example test.py:
> 
> class A():
>     def __add__(self, var):
>         print("I'm in A class")
>         return 5
> a = A()
> a+1
> 1+a
> 
> Execution:
> python test.py
> I'm in A class
> Traceback (most recent call last):
>   File "../../test.py", line 7, in <module>
>     1+a
> TypeError: unsupported operand type(s) for +: 'int' and 'instance'
> 
> 
> So adding integer to class works fine, but adding class to integer fails.
> I could not understand why it happens. In objects/abstact.c we have the 
> following function:
> 

python-dev is about developing Python itself, not about developing in
Python, so that's the wrong mailing list for these kinds of question.

But FWIW the answer is that Python first tries (1).__add__(a); when that
fails (by returning NotImplemented) it uses the reflected method[0], which
is a.__radd__(1). Since that does not exist, the operation is invalid.

[0] http://docs.python.org/2/reference/datamodel.html#object.__radd__
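
As a rough illustration of the lookup order described above, the sketch below models `v + w` in pure Python. It is a simplification under stated assumptions: `binary_add`, `OnlyAdd` and `WithRadd` are hypothetical names, and the model ignores the subclass-priority rule and everything else the C-level slots do.

```python
def binary_add(v, w):
    """Simplified, hypothetical model of how ``v + w`` is dispatched."""
    # Step 1: try the left operand's __add__.
    add = getattr(type(v), "__add__", None)
    if add is not None:
        result = add(v, w)
        if result is not NotImplemented:
            return result
    # Step 2: fall back to the right operand's reflected method.
    if type(w) is not type(v):
        radd = getattr(type(w), "__radd__", None)
        if radd is not None:
            result = radd(w, v)
            if result is not NotImplemented:
                return result
    raise TypeError("unsupported operand type(s) for +: %r and %r"
                    % (type(v).__name__, type(w).__name__))


class OnlyAdd(object):
    def __add__(self, other):
        return 5


class WithRadd(OnlyAdd):
    def __radd__(self, other):
        return 6


print(binary_add(OnlyAdd(), 1))   # 5
print(binary_add(1, WithRadd()))  # 6
try:
    binary_add(1, OnlyAdd())      # no __radd__ to fall back on
except TypeError as exc:
    print("TypeError:", exc)
```

In the model, as in the real interpreter, the reflected method receives the operands swapped, which is why defining only `__add__` is not enough for `1 + a`.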


Re: [Python-Dev] summing integer and class

2013-10-03 Thread Chris Kaynor
This list is for development OF Python, not for development in python. For
that reason, I will redirect this to python-list as well. My actual answer
is below.

On Thu, Oct 3, 2013 at 6:45 AM, Igor Vasilyev 
 wrote:

> Hi.
>
> Example test.py:
>
> class A():
>     def __add__(self, var):
>         print("I'm in A class")
>         return 5
> a = A()
> a+1
> 1+a
>
> Execution:
> python test.py
> I'm in A class
> Traceback (most recent call last):
>   File "../../test.py", line 7, in <module>
>     1+a
> TypeError: unsupported operand type(s) for +: 'int' and 'instance'
>
>
> So adding integer to class works fine, but adding class to integer fails.
> I could not understand why it happens. In objects/abstact.c we have the
> following function:
>

Based on the code you provided, you are only overloading the __add__
operator, which is only called when an "A" is added to something else, not
when something is added to an "A". You can also override the __radd__
method to perform the swapped addition. See
http://docs.python.org/2/reference/datamodel.html#object.__radd__ for the
documentation (it is just below the entry on __add__).

Note that for many simple cases, you could define just a single function,
which then is defined as both the __add__ and __radd__ operator. For
example, you could modify your "A" sample class to look like:

class A():
    def __add__(self, var):
        print("I'm in A")
        return 5
    __radd__ = __add__


Which will produce:
>>> a = A()
>>> a + 1
I'm in A
5
>>> 1 + a
I'm in A
5

Chris


[Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Christian Heimes
Hi,

some of you may have seen that I'm working on a PEP for a new hash API
and new algorithms for hashing of bytes and str. The PEP has three major
aspects. It introduces DJB's SipHash as a secure hash algorithm, changes
the hash API to process blocks of data instead of characters, and adds an
API to make the algorithm pluggable. A draft is already available [1].

Now I got some negative feedback on the 'pluggable' aspect of the new
PEP on Twitter [2]. I would like to get feedback from you before I
finalize the PEP.

The PEP proposes a pluggable hash API for a couple of reasons. I would
like to give users of Python a chance to replace a secure hash algorithm
with a faster hash algorithm. SipHash is about as fast as FNV for common
cases, as our implementation of FNV processes only 8 to 32 bits per cycle
instead of 32 or 64. I haven't actually benchmarked how a faster hash
algorithm affects a real program, though ...

I would also like to make it easier to replace the hash algorithm with a
different one in case a vulnerability is found. With the new API, vendors
and embedders have an easy and clean way to use their own hash
implementation or an optimized version that is more suitable for their
platform, too. For example, a mobile phone vendor could provide an
optimized implementation with ARM NEON intrinsics.


On which level should Python support a pluggable hash algorithm?

1) Compile time option: The hash code is compiled into Python's core.
Embedders have to recompile Python with different options to replace the
function.

2) Library option: A hash algorithm can be added and one available hash
algorithm can be set before Py_Initialize() is called for the first
time. This approach gives embedders the chance to set their own
algorithm without recompiling Python.

3) Startup options: Like 2) plus an additional environment variable and
command line argument to select an algorithm. With a startup option
users can select a different algorithm themselves.

Christian

[1] http://www.python.org/dev/peps/pep-0456/
[2] https://twitter.com/EDEADLK/status/385572395777818624



Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Antoine Pitrou
On Thu, 03 Oct 2013 20:42:28 +0200
Christian Heimes  wrote:
> 
> I haven't actually benchmarked how a faster hash algorithm
> affects the a real program, though ...

Chances are it doesn't. Only a "slow enough" hash algorithm might have
an impact, IMHO.

> On which level should Python support a pluggable hash algorithm?
> 
> 1) Compile time option: The hash code is compiled into Python's core.
> Embedders have to recompile Python with different options to replace the
> function.

Not much point IMHO. Embedders can patch Python if they really need
this.

> 2) Library option: A hash algorithm can be added and one avaible hash
> algorithm can be set before Py_Initialize() is called for the first
> time.

Too complicated. The library option should only offer the option to
replace the hash algorithm, not "add an available algorithm".

> 3) Startup options: Like 2) plus an additional environment variable and
> command line argument to select an algorithm. With a startup option
> users can select a different algorithm themselves.

-0.9. I think it's overkill.

Regards

Antoine.




Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Guido van Rossum
Hm. I would like to stick to the philosophy that Python's hash should be as
fast as it possibly can be, and should not be mistaken for a cryptographic
hash. The point is to optimize dict lookups, nothing more, given typical
(or even atypical) key distribution, not to thwart deliberate attacks. We
already have adopted a feature that plugged most viable attacks on web
apps, I think that's enough. I also agree with Antoine's response.


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Christian Heimes
On 03.10.2013 at 21:05, Guido van Rossum wrote:
> Hm. I would like to stick to the philosophy that Python's hash
> should be as fast as it possibly can be, and should not be mistaken
> for a cryptographic hash. The point is to optimize dict lookups,
> nothing more, given typical (or even atypical) key distribution,
> not to thwart deliberate attacks. We already have adopted a feature
> that plugged most viable attacks on web apps, I think that's
> enough. I also agree with Antoine's response.

Python's hash is neither as fast nor as secure as it can possibly be.

It's not as fast because it doesn't use the full power of modern CPUs.
In most cases the code processes only 1 or 2 bytes per cycle instead
of 8 bytes on 64-bit architectures.

It's not as secure because Jean-Philippe Aumasson and Daniel J. Bernstein
(who are coincidentally the authors of SipHash) have shown how to recover
Python randomization keys.
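
To illustrate the byte-at-a-time behaviour mentioned above, here is a pure-Python sketch of an FNV-style string hash in the spirit of the one used by CPython 3.3 and earlier. It is written from memory as an assumption, not copied from the source tree: the function name `legacy_fnv_hash`, the 64-bit width, and the `prefix`/`suffix` parameters (standing in for the two halves of the randomization secret) are all labels chosen for this example.

```python
MASK64 = (1 << 64) - 1   # assumption: 64-bit build
MULTIPLIER = 1000003     # multiplier of the old string hash


def legacy_fnv_hash(data, prefix=0, suffix=0):
    """Hypothetical model of the pre-SipHash bytes hash (one byte per step)."""
    if not data:
        return 0
    x = prefix & MASK64
    x ^= (data[0] << 7) & MASK64
    for byte in data:                      # one multiply and xor per byte
        x = ((MULTIPLIER * x) & MASK64) ^ byte
    x ^= len(data)
    x ^= suffix & MASK64
    if x >= 1 << 63:                       # reinterpret as signed 64-bit
        x -= 1 << 64
    return -2 if x == -1 else x


print(legacy_fnv_hash(b"a"))  # 12416037344
```

Each loop iteration consumes a single byte, whereas a block-oriented algorithm can read a whole machine word per step; the secret only enters through one xor at the start and one at the end.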

SipHash:
  more secure and about same speed on most systems
optimized FNV:
  faster but with a known issue

Christian


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Guido van Rossum
On Thu, Oct 3, 2013 at 12:23 PM, Christian Heimes wrote:

> Am 03.10.2013 21:05, schrieb Guido van Rossum:
> > Hm. I would like to stick to the philosophy that Python's hash
> > should be as fast as it possibly can be, and should not be mistaken
> > for a cryptographic hash. The point is to optimize dict lookups,
> > nothing more, given typical (or even atypical) key distribution,
> > not to thwart deliberate attacks. We already have adopted a feature
> > that plugged most viable attacks on web apps, I think that's
> > enough. I also agree with Antoine's response.
>
> Python's hash is neither as fast nor as secure as it can possibly be.
>

But fixing that shouldn't need all the extra stuff you're proposing.

> It's not as fast because it doesn't use the full power of modern CPUs.
> In most cases the code processes only 1 or 2 bytes per cycle instead
> of 8 bytes on 64-bit architectures. Jean-Philippe Aumasson and Daniel
> J. Bernstein (who are coincidentally the authors of SipHash) have
> shown how to recover Python randomization keys.
>

What's a Python randomization key?


> SipHash:
>   more secure and about same speed on most systems
>

Same speed as what?


> optimized FNV:
>   faster but with a known issue
>

What issue?

-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] PEP 456

2013-10-03 Thread Serhiy Storchaka

Just some comments.

> the first time time with a bit shift of 7

Double "time".

> with a 128bit seed and 64-bit output

Inconsistent hyphenation. The same issue appears in other places.

> bytes_hash provides the tp_hash slot function for unicode.

Typo. Should be "unicode_hash".

> len = PyUnicode_GET_LENGTH(self);
> switch (PyUnicode_KIND(self)) {
> case PyUnicode_1BYTE_KIND: {
> const Py_UCS1 *c = PyUnicode_1BYTE_DATA(self);
> x = _PyHash_Func->hashfunc(c, len * sizeof(Py_UCS1));
> break;
> }
> case PyUnicode_2BYTE_KIND: {
...

x = _PyHash_Func->hashfunc(PyUnicode_BYTE_DATA(self), 
PyUnicode_GET_LENGTH(self) * PyUnicode_KIND(self));


> Equal hash values result in a hash collision and therefore cause a
> minor speed penalty for dicts and sets with mixed keys. The cause of the
> collision could be removed


I doubt this. If one collects bytes and strings in one dictionary,
this equality will only double the number of collisions (for a DoS attack
we need to increase it by thousands or millions of times). So it doesn't
matter. On the other hand, if one deliberately uses bytes and str
subclasses with overridden equality, the same hash for ASCII bytes and
strings may be needed.


> For very short strings the setup costs for SipHash dominates its
> speed but it is still in the same order of magnitude as the current FNV
> code.


We could use another algorithm for very short strings if it matters.

> The summarized total runtime of the benchmark is within 1% of the
> runtime of an unmodified Python 3.4 binary.


What about deviations of individual tests?




Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Christian Heimes
On 03.10.2013 at 21:45, Guido van Rossum wrote:
> But fixing that shouldn't need all the extra stuff you're
> proposing.

I have proposed some of the extra stuff for more flexibility, the rest
is for testing and debugging.

> What's a Python randomization key?

Python's hash randomization key, the seed to randomize the output of
hash() for bytes and str.
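
A small sketch of what that seed does in practice, assuming a CPython with hash randomization (the `PYTHONHASHSEED` environment variable). The helper name `hash_in_child` and the sample string are hypothetical; each call starts a fresh interpreter so that the seed is applied at startup.

```python
import os
import subprocess
import sys


def hash_in_child(seed, text="python-dev"):
    """Return hash(text) as computed by a fresh interpreter with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash(%r))" % text], env=env
    )
    return int(out.decode().strip())


same_a = hash_in_child(1)
same_b = hash_in_child(1)
other = hash_in_child(2)

print(same_a == same_b)  # True: same seed, same hash
print(same_a == other)   # a different seed is expected to change the hash
```

An attacker who can recover the seed can predict these values again, which is the weakness the quoted SipHash page is about.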

> SipHash: more secure and about same speed on most systems
> 
> Same speed as what?

Same speed as the current algorithm in Python 3.3 and earlier.

> optimized FNV: faster but with a known issue
> 
> What issue?

Quote from https://131002.net/siphash/#at
---
  Jointly with Martin Boßlet, we demonstrated weaknesses in MurmurHash
(used in Ruby, Java, etc.), CityHash (used in Google), and in Python's
hash. Some of the technologies affected have switched to SipHash. See
this oCERT advisory, and the following resources:

  [...]

  - Python script https://131002.net/siphash/poc.py to recover
the secret seed of the hash randomization in Python 2.7.3 and
3.2.3
---

It's all documented in my PEP draft, too.

Christian








Re: [Python-Dev] [Python-checkins] cpython: [issue19151] Fix issue number in Misc/NEWS entry.

2013-10-03 Thread A.M. Kuchling
On Thu, Oct 03, 2013 at 08:48:47PM +0200, eric.snow wrote:
> -- Issue #19951: Fix docstring and use of _get_suppported_file_loaders() to
> +- Issue #19151: Fix docstring and use of _get_suppported_file_loaders() to
   ^^^ likely a typo?

--amk


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Victor Stinner
2013/10/3 Christian Heimes :
> A hash algorithm can be added and one available hash
> algorithm can be set before Py_Initialize() is called for the first
> time.

"Py_Initialize" is not a good guard. Try for example "python3 -X
faulthandler": PyObject_Hash() is called before Py_Initialize() to add
the "faulthandler" key to the sys._xoptions dictionary.

Today many Python internal functions are used before Python is
initialized... See PEP 432, which proposes to improve the
situation:
http://www.python.org/dev/peps/pep-0432/

Victor


Re: [Python-Dev] [Python-checkins] cpython: [issue19151] Fix issue number in Misc/NEWS entry.

2013-10-03 Thread Eric Snow
On Thu, Oct 3, 2013 at 1:57 PM, A.M. Kuchling  wrote:
> On Thu, Oct 03, 2013 at 08:48:47PM +0200, eric.snow wrote:
>> -- Issue #19951: Fix docstring and use of _get_suppported_file_loaders() to
>> +- Issue #19151: Fix docstring and use of _get_suppported_file_loaders() to
>                                                   ^^^ likely a typo?

It's just awkward phrasing.  I suppose it would be more clear as "Fix
the docstring of _get_supported_file_loaders() and its use to...".  If
you think it's worth fixing, I'll fix it.

-eric


Re: [Python-Dev] [Python-checkins] cpython: [issue19151] Fix issue number in Misc/NEWS entry.

2013-10-03 Thread Barry Warsaw
On Oct 03, 2013, at 02:08 PM, Eric Snow wrote:

>On Thu, Oct 3, 2013 at 1:57 PM, A.M. Kuchling  wrote:
>> On Thu, Oct 03, 2013 at 08:48:47PM +0200, eric.snow wrote:
>>> -- Issue #19951: Fix docstring and use of _get_suppported_file_loaders() to
>>> +- Issue #19151: Fix docstring and use of _get_suppported_file_loaders() to
>
>It's just awkward phrasing.  I suppose it would be more clear as "Fix
>the docstring of _get_supported_file_loaders() and its use to...".  If
>you think it's worth fixing, I'll fix it.

PPProbably not the typppo Andrew was pppointing out.

-Bary


Re: [Python-Dev] [Python-checkins] cpython: [issue19151] Fix issue number in Misc/NEWS entry.

2013-10-03 Thread Eric Snow
On Thu, Oct 3, 2013 at 2:21 PM, Barry Warsaw  wrote:
> PPProbably not the typppo Andrew was pppointing out.
>
> -Bary

Ohhh, that typppo.

-eric


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Guido van Rossum
On Thu, Oct 3, 2013 at 12:55 PM, Christian Heimes wrote:

> On 03.10.2013 21:45, Guido van Rossum wrote:
> > But fixing that shouldn't need all the extra stuff you're
> > proposing.
>
> I have proposed some of the extra stuff for more flexibility, the rest
> is for testing and debugging.
>

Hm, I don't think we need more infrastructure for this. As Antoine said, if
you're hacking on this you might as well edit the source.


> > What's a Python randomization key?
>
> Python's hash randomization key, the seed to randomize the output of
> hash() for bytes and str.
>

Is the seed itself crypto-safe? (I.e. is it derived carefully from urandom?)


>  > SipHash: more secure and about same speed on most systems
> >
> > Same speed as what?
>
> Same speed as the current algorithm in Python 3.3 and earlier.
>

OK, then I have no objection to switching to it, *if* the security issue is
really worth fixing. Otherwise it would be better to look for a hash that
is *faster*, given your assertion that the current hash is inefficient.


> > optimized FNV: faster but with a known issue
> >
> > What issue?
>
> Quote from https://131002.net/siphash/#at
> ---
>   Jointly with Martin Boßlet, we demonstrated weaknesses in MurmurHash
> (used in Ruby, Java, etc.), CityHash (used in Google), and in Python's
> hash. Some of the technologies affected have switched to SipHash. See
> this oCERT advisory, and the following resources:
>
>   [...]
>
>   - Python script https://131002.net/siphash/poc.py to recover
> the secret seed of the hash randomization in Python 2.7.3 and
> 3.2.3
>

Sounds a bit like some security researchers drumming up business. If you
can run the binary, presumably you can also recover the seed by looking in
/proc, right? Or use ctypes or something. This demonstration seems of
academic interest only.


> ---
>
> It's all documented in my PEP draft, too.


Yeah, there's lots of stuff there. I'm looking for the TL;DR version. :-)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 456

2013-10-03 Thread Christian Heimes
On 03.10.2013 21:53, Serhiy Storchaka wrote:
>> the first time time with a bit shift of 7
> 
> Double "time".

thx, fixed

>> with a 128bit seed and 64-bit output
> 
> Inconsistency with hyphen. There are the same issues in other places.

I have unified the use of hyphens, thx!

>> bytes_hash provides the tp_hash slot function for unicode.
> 
> Typo. Should be "unicode_hash".

Fixed

> x = _PyHash_Func->hashfunc(PyUnicode_BYTE_DATA(self),
> PyUnicode_GET_LENGTH(self) * PyUnicode_KIND(self));

Oh nice, that's easier to read. It's PyUnicode_DATA().

> I doubt this. If one collects bytes and strings in one dictionary,
> this equality will only double the number of collisions (for a DoS attack
> we need to increase it by thousands or millions of times). So it doesn't
> matter. On the other hand, if one deliberately uses bytes and str
> subclasses with overridden equality, the same hash for ASCII bytes and
> strings can be needed.

It's not a big problem. I merely wanted to point out that there is a
simple possibility for a minor optimization. That's all. :)

>> For very short strings the setup cost for SipHash dominates its speed
>> but it is still in the same order of magnitude as the current FNV code.
>
> We could use another algorithm for very short strings if it matters.

I thought of that, too. The threshold is rather small, though. As far as
I remember an effective hash collision DoS works with 7 or 8 chars.

>> The summarized total runtime of the benchmark is within 1% of the
>> runtime of an unmodified Python 3.4 binary.
> 
> What about deviations of individual tests?

Here you go.

http://pastebin.com/dKdnBCgb
http://pastebin.com/wtfUS5Zz

Christian


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Nick Coghlan
On 4 Oct 2013 06:08, "Victor Stinner"  wrote:
>
> 2013/10/3 Christian Heimes :
> > A hash algorithm can be added and one available hash
> > algorithm can be set before Py_Initialize() is called for the first
> > time.
>
> "Py_Initialize" is not a good guard. Try for example "python3 -X
> faulthandler": PyObject_Hash() is called before Py_Initialize() to add
> the "faulthandler" key to the sys._xoptions dictionary.
>
> Today many Python internal functions are used before Python is
> initialized... See PEP 432, which proposes to improve the
> situation:
> http://www.python.org/dev/peps/pep-0432/

That problem exists because our main function doesn't follow the C API
usage rules, though. We require other embedding applications to be better
behaved than that if they want support :)

That said, while I'm mostly in favour of the PEP, I think setting the
algorithm should be a private API for 3.4.

I do agree that since the platform support for SipHash is slightly
narrower,  we need to keep the existing hash algorithm around, make it
relatively easy to enable and ensure we continue to test it on the build
bots.

I believe that last requirement for buildbot testing is the one that should
drive the design of the private configuration API.

Cheers,
Nick.

>
> Victor


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Guido van Rossum
On Thu, Oct 3, 2013 at 2:13 PM, Nick Coghlan  wrote:

> On 4 Oct 2013 06:08, "Victor Stinner"  wrote:
> >
> > 2013/10/3 Christian Heimes :
> > > A hash algorithm can be added and one available hash
> > > algorithm can be set before Py_Initialize() is called for the first
> > > time.
> >
> > "Py_Initialize" is not a good guard. Try for example "python3 -X
> > faulthandler": PyObject_Hash() is called before Py_Initialize() to add
> > the "faulthandler" key to the sys._xoptions dictionary.
> >
> > Today many Python internal functions are used before Python is
> > initialized... See PEP 432, which proposes to improve the
> > situation:
> > http://www.python.org/dev/peps/pep-0432/
>
> That problem exists because our main function doesn't follow the C API
> usage rules, though. We require other embedding applications to be better
> behaved than that if they want support :)
>
> That said, while I'm mostly in favour of the PEP, I think setting the
> algorithm should be a private API for 3.4.
>
> I do agree that since the platform support for SipHash is slightly
> narrower,  we need to keep the existing hash algorithm around, make it
> relatively easy to enable and ensure we continue to test it on the build
> bots.
>
> I believe that last requirement for buildbot testing is the one that
> should drive the design of the private configuration API.
>
I'll defer to Nick for approval of this PEP.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Serhiy Storchaka

On 03.10.13 23:47, Guido van Rossum wrote:
> On Thu, Oct 3, 2013 at 12:55 PM, Christian Heimes wrote:
>> On 03.10.2013 21:45, Guido van Rossum wrote:
>>> But fixing that shouldn't need all the extra stuff you're
>>> proposing.
>>
>> I have proposed some of the extra stuff for more flexibility, the rest
>> is for testing and debugging.
>
> Hm, I don't think we need more infrastructure for this. As Antoine said,
> if you're hacking on this you might as well edit the source.

What we could do is to move all hash-related stuff into separate .c and
.h files.

>>> SipHash: more secure and about same speed on most systems
>>>
>>> Same speed as what?
>>
>> Same speed as the current algorithm in Python 3.3 and earlier.
>
> OK, then I have no objection to switching to it, *if* the security issue
> is really worth fixing. Otherwise it would be better to look for a hash
> that is *faster*, given your assertion that the current hash is inefficient.

Actually, it is the same speed only for UCS1 strings. For UCS2 and UCS4
strings it can be 5x to 10x slower [1]. But I don't know how it affects
real programs.


[1] http://bugs.python.org/issue14621#msg175048




Re: [Python-Dev] summing integer and class

2013-10-03 Thread Greg Ewing

Igor Vasilyev wrote:

> class A():
>     def __add__(self, var):
>         print("I'm in A class")
>         return 5
> a = A()
> a+1
> 1+a
>
> Execution:
> python test.py
> I'm in A class
> Traceback (most recent call last):
>   File "../../test.py", line 7, in
>     1+a
> TypeError: unsupported operand type(s) for +: 'int' and 'instance'

You need to define an __radd__ method for it to work
as a right-hand operand.

> When we are adding a class to an integer we have both slotv and slotw. x =
> slotv(v, w); -> returns Py_NotImplemented.
>
> But in this case we should execute x = slotw(v, w);

Yes, but the wrapper that gets put in the type slot calls
__rxxx__ instead of __xxx__ if it's being called for the
right-hand operand.
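
To make the dispatch concrete, here is a small sketch of the class above with a reflected method added. The return values are strings chosen only for this illustration, so that it is visible which method ran.

```python
class A:
    def __add__(self, other):
        # Runs for `a + other`, where the instance is the left operand.
        return "__add__ called with %r" % (other,)

    def __radd__(self, other):
        # Runs for `other + a` once int.__add__ has returned NotImplemented.
        return "__radd__ called with %r" % (other,)


a = A()
left = a + 1    # dispatched to A.__add__
right = 1 + a   # int cannot handle A, so A.__radd__ is tried
print(left)
print(right)
```

Without __radd__, the second expression raises TypeError as in the traceback quoted above.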

--
Greg


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Gregory P. Smith
On Thu, Oct 3, 2013 at 11:42 AM, Christian Heimes wrote:

> Hi,
>
> some of you may have seen that I'm working on a PEP for a new hash API
> and new algorithms for hashing of bytes and str. The PEP has three major
> aspects. It introduces DJB's SipHash as a secure hash algorithm, changes
> the hash API to process blocks of data instead of characters and it adds an
> API to make the algorithm pluggable. A draft is already available [1].
>
> Now I got some negative feedback on the 'pluggable' aspect of the new
> PEP on Twitter [2]. I'd like to get feedback from you before I finalize
> the PEP.
>
> The PEP proposes a pluggable hash API for a couple of reasons. I'd like to
> give users of Python a chance to replace a secure hash algorithm with a
> faster hash algorithm. SipHash is about as fast as FNV for common cases
> as our implementation of FNV processes only 8 to 32 bits per cycle instead
> of 32 or 64. I haven't actually benchmarked how a faster hash algorithm
> affects a real program, though ...
>
> I'd also like to make it easier to replace the hash algorithm with a
> different one in case a vulnerability is found. With the new API vendors
> and embedders have an easy and clean way to use their own hash
> implementation or an optimized version that is more suitable for their
> platform, too. For example a mobile phone vendor could provide an
> optimized implementation with ARM NEON intrinsics.
>
>
> On which level should Python support a pluggable hash algorithm?
>
> 1) Compile time option: The hash code is compiled into Python's core.
> Embedders have to recompile Python with different options to replace the
> function.
>

This would be fine with me.


>
> 2) Library option: A hash algorithm can be added and one available hash
> algorithm can be set before Py_Initialize() is called for the first
> time. The approach gives embedders the chance to set their own
> algorithm without recompiling Python.
>

This would be more convenient. But I only want to replace the algorithm in
my code embedding CPython, not add one. So long as the performance impact
of supporting this is not usefully relevant, do that (let me supply the
algorithm to be used for each of bytes and str before Py_Initialize is
called).


>
> 3) Startup options: Like 2) plus an additional environment variable and
> command line argument to select an algorithm. With a startup option
> users can select a different algorithm themselves.
>

I can't imagine any reason I or anyone else would ever want this.

side note: In Python 2 and earlier the hash algorithm went to great lengths
to make unicode and bytes values that were the same (in at least ascii,
possibly latin-1 or utf-8 as well) hash to the same value. Is it safe to
assume that very annoying performance sapping invariant is no longer
required in Python 3 given that the whole default encoding for bytes to
unicode comparisons is gone?  (and thus the need for them to land in the
same dict hash bucket)
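
For context on that question, a minimal sketch, assuming Python 3 run without the -b option: bytes and str objects with the same ASCII content never compare equal, so a dict keeps them as separate keys whatever their hash values are.

```python
key_bytes = b"spam"
key_str = "spam"

# In Python 3 there is no implicit decoding, so these are simply unequal.
are_equal = key_bytes == key_str
print(are_equal)

# Both keys survive in the same dict; neither overwrites the other.
table = {key_bytes: "bytes value", key_str: "str value"}
print(len(table))
print(table[key_bytes], "/", table[key_str])
```

Dict correctness only needs equal keys to have equal hashes, and since these two keys are unequal, nothing forces their hashes to match.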

-gps


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Gregory P. Smith
On Thu, Oct 3, 2013 at 12:05 PM, Guido van Rossum  wrote:

> We already have adopted a feature that plugged most viable attacks on web
> apps, I think that's enough.
>

Actually... we did not do a very good job on that:
http://bugs.python.org/issue14621

The point of allowing alternates is to let people with needs choose
something else if they want without having to jump through hoops of
modifying the guts of Python to do it. I don't expect python as shipped by
most OS distros to use anything other than our default.

-gps


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Gregory P. Smith
On Thu, Oct 3, 2013 at 1:06 PM, Victor Stinner wrote:

> 2013/10/3 Christian Heimes :
> > A hash algorithm can be added and one available hash
> > algorithm can be set before Py_Initialize() is called for the first
> > time.
>
> "Py_Initialize" is not a good guard. Try for example "python3 -X
> faulthandler": PyObject_Hash() is called before Py_Initialize() to add
> the "faulthandler" key to the sys._xoptions dictionary.
>
> Today many Python internal functions are used before Python is
> initialized... See PEP 432, which proposes to improve the
> situation:
> http://www.python.org/dev/peps/pep-0432/


then I withdraw my desire for setting it before that.  compile time is
fine.  but how would you make that usefully easier than the existing method
of replacing the two functions in our bytes and str implementations?

if you want a compile time flag, perhaps just call it --enable-sip-hash and
get it over with since that's what we really want. ;)
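
Whatever the build flag ends up being called, a compile-time choice can still be inspected at runtime. A minimal sketch, assuming Python 3.4 or later, where sys.hash_info carries the algorithm, hash_bits and seed_bits fields (this postdates the thread and is not something proposed in this message):

```python
import sys

info = sys.hash_info

# Name of the str/bytes hash algorithm this interpreter was built with.
algorithm = info.algorithm
print(algorithm)

# Internal output size of the algorithm and size of its seed key, in bits.
print(info.hash_bits, info.seed_bits)
```

The value printed depends on how the interpreter was compiled, so no particular output is assumed here.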

-gps


>
>
> Victor


Re: [Python-Dev] Make str/bytes hash algorithm pluggable?

2013-10-03 Thread Armin Rigo
Hi Guido,

On Thu, Oct 3, 2013 at 10:47 PM, Guido van Rossum  wrote:
> Sounds a bit like some security researchers drumming up business. If you can
> run the binary, presumably you can also recover the seed by looking in
> /proc, right? Or use ctypes or something. This demonstration seems of
> academic interest only.

I'll not try to defend the opposite point of view very actively, but
let me just say that, in my opinion, your objection is not valid.  It
is broken the same way as a different objection, which would claim
that Python can be made sandbox-safe without caring about the numerous
segfault cases.  They are all very obscure for sure; I tried at some
point to list them in Lib/test/crashers.  I gave up when people
started deleting the files because they no longer crashed on newer
versions, just because details changed --- but not because the general
crash they explained was in any way fixed...  Anyway, my point is that
most segfaults can, given enough effort, be transformed into a single,
well-documented tool to conduct a large class of attacks.

The hash issue is similar.  It should be IMHO either ignored (which is
fine for a huge fraction of users), or seriously fixed by people with
the correctly pessimistic approach.  The current hash randomization is
simply not preventing anything; someone posted long ago a way to
recover bit-by-bit the hash randomization seed used by a remote web
program in Python running on a server.  The only benefit of this hash
randomization option (-R) was to say to the press that Python fixed
the problem very quickly when it was publicized :-/

This kind of security issue should never be classified as "academic
interest only".  Instead it can be classified as "it will take weeks
/ months / years before some crazy man manages to put together a
general attack script, but likely, someone will eventually".

From this point of view I'm saluting Christian's effort, even if I
prefer to stay far away from this kind of issue myself :-)


See you soon,

Armin.