[Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Michael Foord
Hello all,

A paper (well, presentation) has been published highlighting security problems 
with the hashing algorithm (exploiting collisions) in many programming 
languages Python included:

 
http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf

Although it's a security issue I'm posting it here because it is now public and 
seems important.

The issue they report can cause (for example) handling an http post to consume 
horrible amounts of cpu. For Python the figures they quoted:

reasonable-sized attack strings only for 32 bits Plone has max. POST 
size of 1 MB
7 minutes of CPU usage for a 1 MB request
~20 kbits/s → keep one Core Duo core busy

This was apparently reported to the security list, but hasn't been responded to 
beyond an acknowledgement on November 24th (the original report didn't make it 
onto the security list because it was held in a moderation queue). 

The same vulnerability was reported against various languages and web 
frameworks, and is already fixed in some of them.

Their recommended fix is to randomize the hash function.

All the best,

Michael


--
http://www.voidspace.org.uk/


May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing 
http://www.sqlite.org/different.html





___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Jesse Noller


On Wednesday, December 28, 2011 at 8:28 PM, Michael Foord wrote:

> Hello all,
>  
> A paper (well, presentation) has been published highlighting security 
> problems with the hashing algorithm (exploiting collisions) in many 
> programming languages Python included:
>  
> http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
>  
> Although it's a security issue I'm posting it here because it is now public 
> and seems important.
>  
> The issue they report can cause (for example) handling an http post to 
> consume horrible amounts of cpu. For Python the figures they quoted:
>  
> reasonable-sized attack strings only for 32 bits Plone has max. POST size of 
> 1 MB
> 7 minutes of CPU usage for a 1 MB request
> ~20 kbits/s → keep one Core Duo core busy
>  
> This was apparently reported to the security list, but hasn't been responded 
> to beyond an acknowledgement on November 24th (the original report didn't 
> make it onto the security list because it was held in a moderation queue).  
>  
> The same vulnerability was reported against various languages and web 
> frameworks, and is already fixed in some of them.
>  
> Their recommended fix is to randomize the hash function.
>  
> All the best,
>  
> Michael
>  
Back up link for the PDF:
http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf

Ocert disclosure:
http://www.ocert.org/advisories/ocert-2011-003.html

jesse  


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Jesse Noller


On Wednesday, December 28, 2011 at 8:37 PM, Jesse Noller wrote:

>  
>  
> On Wednesday, December 28, 2011 at 8:28 PM, Michael Foord wrote:
>  
> > Hello all,
> >  
> > A paper (well, presentation) has been published highlighting security 
> > problems with the hashing algorithm (exploiting collisions) in many 
> > programming languages Python included:
> >  
> > http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
> >  
> > Although it's a security issue I'm posting it here because it is now public 
> > and seems important.
> >  
> > The issue they report can cause (for example) handling an http post to 
> > consume horrible amounts of cpu. For Python the figures they quoted:
> >  
> > reasonable-sized attack strings only for 32 bits Plone has max. POST size 
> > of 1 MB
> > 7 minutes of CPU usage for a 1 MB request
> > ~20 kbits/s → keep one Core Duo core busy
> >  
> > This was apparently reported to the security list, but hasn't been 
> > responded to beyond an acknowledgement on November 24th (the original 
> > report didn't make it onto the security list because it was held in a 
> > moderation queue).  
> >  
> > The same vulnerability was reported against various languages and web 
> > frameworks, and is already fixed in some of them.
> >  
> > Their recommended fix is to randomize the hash function.
> >  
> > All the best,
> >  
> > Michael
>  
> Back up link for the PDF:
> http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
>  
> Ocert disclosure:
> http://www.ocert.org/advisories/ocert-2011-003.html

And more analysis/information:

http://cryptanalysis.eu/blog/2011/12/28/effective-dos-attacks-against-web-application-plattforms-hashdos/
  


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Eric Snow
On Wed, Dec 28, 2011 at 6:28 PM, Michael Foord
 wrote:
> Hello all,
>
> A paper (well, presentation) has been published highlighting security 
> problems with the hashing algorithm (exploiting collisions) in many 
> programming languages Python included:
>
>         
> http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
>
> Although it's a security issue I'm posting it here because it is now public 
> and seems important.
>
> The issue they report can cause (for example) handling an http post to 
> consume horrible amounts of cpu. For Python the figures they quoted:
>
>        reasonable-sized attack strings only for 32 bits Plone has max. POST 
> size of 1 MB
>        7 minutes of CPU usage for a 1 MB request
>        ~20 kbits/s → keep one Core Duo core busy
>
> This was apparently reported to the security list, but hasn't been responded 
> to beyond an acknowledgement on November 24th (the original report didn't 
> make it onto the security list because it was held in a moderation queue).
>
> The same vulnerability was reported against various languages and web 
> frameworks, and is already fixed in some of them.
>
> Their recommended fix is to randomize the hash function.

Ironically, this morning I ran across a discussion from about 8 years
ago on basically the same thing:

http://mail.python.org/pipermail/python-dev/2003-May/035874.html

 From what I read in the thread, it didn't seem like anyone here was
too worried about it.  Does this new research change anything?

-eric
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Alex Gaynor
A few thoughts on this:

a) This is not a new issue, I'm curious what the new interest is in it.

b) Whatever the solution to this is, it is *not* CPython specific, any decision
should be reflected in the Python language spec IMO, if CPython has the semantic
that dicts aren't vulnerable to hash collision then users *will* rely on this
and another implementation having a different (valid) behavior opens up users to
security issues.

c) I'm not convinced a randomized hash is appropriate for the default dict, for
a number of reasons: it's a performance hit on every dict operations, using a
per-process seed means you can't compile the hash of an obj at Python's compile
time, a per-dict seed inhibits a bunch of other optimizations.  These may not be
relevant to CPython, but they are to PyPy and probably the invoke-dynamic work
on Jython (pursuant to point b).

Therefore I think these should be considered application issues, since request
limiting is difficult and error prone, I'd recommend the Python stdlib including
a non-hash based map (such as a binary tree).

Alex

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Raymond Hettinger
FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue.
It is believed that they give us better-than-random results for commonly
encountered datasets.  A change to randomized hashes would have a
negative performance impact on those cases.

Also, randomizing the hash wreaks havoc on doctests, book examples
not matching actual dict reprs, and on efforts by users to optimize
the insertion order into dicts with frequent lookups.


Raymond





On Dec 28, 2011, at 5:28 PM, Michael Foord wrote:

> Hello all,
> 
> A paper (well, presentation) has been published highlighting security 
> problems with the hashing algorithm (exploiting collisions) in many 
> programming languages Python included:
> 
>
> http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
> 
> Although it's a security issue I'm posting it here because it is now public 
> and seems important.
> 
> The issue they report can cause (for example) handling an http post to 
> consume horrible amounts of cpu. For Python the figures they quoted:
> 
>   reasonable-sized attack strings only for 32 bits Plone has max. POST 
> size of 1 MB
>   7 minutes of CPU usage for a 1 MB request
>   ~20 kbits/s → keep one Core Duo core busy
> 
> This was apparently reported to the security list, but hasn't been responded 
> to beyond an acknowledgement on November 24th (the original report didn't 
> make it onto the security list because it was held in a moderation queue). 
> 
> The same vulnerability was reported against various languages and web 
> frameworks, and is already fixed in some of them.
> 
> Their recommended fix is to randomize the hash function.
> 
> All the best,
> 
> Michael
> 
> 
> --
> http://www.voidspace.org.uk/
> 
> 
> May you do good and not evil
> May you find forgiveness for yourself and forgive others
> May you share freely, never taking more than you give.
> -- the sqlite blessing 
> http://www.sqlite.org/different.html
> 
> 
> 
> 
> 
> ___
> Python-Dev mailing list
> [email protected]
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/raymond.hettinger%40gmail.com

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Christian Heimes
Am 29.12.2011 02:37, schrieb Jesse Noller:
> Back up link for the PDF:
> http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
> 
> Ocert disclosure:
> http://www.ocert.org/advisories/ocert-2011-003.html

>From http://www.nruns.com/_downloads/advisory28122011.pdf

---
Python uses a hash function which is very similar to DJBX33X, which can
be broken using a
meet-in-the-middle attack. It operates on register size and is thus
different for 64 and 32 bit
machines. While generating multi-collisions efficiently is also possible
for the 64 bit version
of the function, the resulting colliding strings are too large to be
relevant for anything more
than an academic attack.

Plone as the most prominent Python web framework accepts 1 MB of POST
data, which it
parses in about 7 minutes of CPU time in the worst case.
This gives an attacker with about 20 kbit/s the possibility to keep one
Core Duo core
constantly busy. If the attacker is in the position to have a Gigabit
line available, he can keep
about 50.000 Core Duo cores busy.
---

If I remember correctly CPython uses the long for its hash function so
64bit Windows uses a 32bit hash.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Christian Heimes
Am 29.12.2011 03:09, schrieb Raymond Hettinger:
> FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue.
> It is believed that they give us better-than-random results for commonly
> encountered datasets.  A change to randomized hashes would have a
> negative performance impact on those cases.
> 
> Also, randomizing the hash wreaks havoc on doctests, book examples
> not matching actual dict reprs, and on efforts by users to optimize
> the insertion order into dicts with frequent lookups.

My five cents on the topic:

I totally concur with Raymound. He, Tim and all the others did a
fantastic job with the dict implementation and optimization. We
shouldn't overreact and mess with the current hashing and dict code just
because a well-known and old attack vector pops up again. The dict code
is far too crucial for Python's overall performance. However the issue
should be documented in our docs.

I've been dealing with web stuff and security for almost a decade. I've
seen far worse attack vectors. This one can easily be solved with a
couple of lines of Python code. For example Application developers can
limit the maximum amount of POST parameters to a sensible amount and
limit the length of each key, too. The issue less severe on platforms
with 64bit hashes, so it won't affect most people. I think only 32bit
Unix and Windows in general (32bit long) are in trouble.

CPython could aid developers with a special subclass of dict. The
crucial lookup function is already overwrite-able per dict instance and
on subclasses of dict through PyDictObj's struct member PyDictEntry
*(*ma_lookup)(PyDictObject *mp, PyObject *key, long hash). For example
specialized subclass could limit the seach for a free slot to n
recursions or choose to ignore the hash argument and calculate its own
hash of the key.

Christian
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Hash collision security issue (now public)

2011-12-28 Thread Brian Curtin
On Wed, Dec 28, 2011 at 19:51, Alex Gaynor  wrote:
> A few thoughts on this:
>
> a) This is not a new issue, I'm curious what the new interest is in it.

Well they (the presenters of the report) had to be accepted to that
conference for *something*, otherwise we wouldn't know they exist.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com