[Python-Dev] Hash collision security issue (now public)
Hello all, A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting collisions) in many programming languages Python included: http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf Although it's a security issue I'm posting it here because it is now public and seems important. The issue they report can cause (for example) handling an http post to consume horrible amounts of cpu. For Python the figures they quoted: reasonable-sized attack strings only for 32 bits Plone has max. POST size of 1 MB 7 minutes of CPU usage for a 1 MB request ~20 kbits/s → keep one Core Duo core busy This was apparently reported to the security list, but hasn't been responded to beyond an acknowledgement on November 24th (the original report didn't make it onto the security list because it was held in a moderation queue). The same vulnerability was reported against various languages and web frameworks, and is already fixed in some of them. Their recommended fix is to randomize the hash function. All the best, Michael -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
On Wednesday, December 28, 2011 at 8:28 PM, Michael Foord wrote: > Hello all, > > A paper (well, presentation) has been published highlighting security > problems with the hashing algorithm (exploiting collisions) in many > programming languages Python included: > > http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf > > Although it's a security issue I'm posting it here because it is now public > and seems important. > > The issue they report can cause (for example) handling an http post to > consume horrible amounts of cpu. For Python the figures they quoted: > > reasonable-sized attack strings only for 32 bits Plone has max. POST size of > 1 MB > 7 minutes of CPU usage for a 1 MB request > ~20 kbits/s → keep one Core Duo core busy > > This was apparently reported to the security list, but hasn't been responded > to beyond an acknowledgement on November 24th (the original report didn't > make it onto the security list because it was held in a moderation queue). > > The same vulnerability was reported against various languages and web > frameworks, and is already fixed in some of them. > > Their recommended fix is to randomize the hash function. > > All the best, > > Michael > Back up link for the PDF: http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf Ocert disclosure: http://www.ocert.org/advisories/ocert-2011-003.html jesse ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
On Wednesday, December 28, 2011 at 8:37 PM, Jesse Noller wrote: > > > On Wednesday, December 28, 2011 at 8:28 PM, Michael Foord wrote: > > > Hello all, > > > > A paper (well, presentation) has been published highlighting security > > problems with the hashing algorithm (exploiting collisions) in many > > programming languages Python included: > > > > http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf > > > > Although it's a security issue I'm posting it here because it is now public > > and seems important. > > > > The issue they report can cause (for example) handling an http post to > > consume horrible amounts of cpu. For Python the figures they quoted: > > > > reasonable-sized attack strings only for 32 bits Plone has max. POST size > > of 1 MB > > 7 minutes of CPU usage for a 1 MB request > > ~20 kbits/s → keep one Core Duo core busy > > > > This was apparently reported to the security list, but hasn't been > > responded to beyond an acknowledgement on November 24th (the original > > report didn't make it onto the security list because it was held in a > > moderation queue). > > > > The same vulnerability was reported against various languages and web > > frameworks, and is already fixed in some of them. > > > > Their recommended fix is to randomize the hash function. > > > > All the best, > > > > Michael > > Back up link for the PDF: > http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf > > Ocert disclosure: > http://www.ocert.org/advisories/ocert-2011-003.html And more analysis/information: http://cryptanalysis.eu/blog/2011/12/28/effective-dos-attacks-against-web-application-plattforms-hashdos/ ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
On Wed, Dec 28, 2011 at 6:28 PM, Michael Foord wrote: > Hello all, > > A paper (well, presentation) has been published highlighting security > problems with the hashing algorithm (exploiting collisions) in many > programming languages Python included: > > > http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf > > Although it's a security issue I'm posting it here because it is now public > and seems important. > > The issue they report can cause (for example) handling an http post to > consume horrible amounts of cpu. For Python the figures they quoted: > > reasonable-sized attack strings only for 32 bits Plone has max. POST > size of 1 MB > 7 minutes of CPU usage for a 1 MB request > ~20 kbits/s → keep one Core Duo core busy > > This was apparently reported to the security list, but hasn't been responded > to beyond an acknowledgement on November 24th (the original report didn't > make it onto the security list because it was held in a moderation queue). > > The same vulnerability was reported against various languages and web > frameworks, and is already fixed in some of them. > > Their recommended fix is to randomize the hash function. Ironically, this morning I ran across a discussion from about 8 years ago on basically the same thing: http://mail.python.org/pipermail/python-dev/2003-May/035874.html From what I read in the thread, it didn't seem like anyone here was too worried about it. Does this new research change anything? -eric ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
A few thoughts on this: a) This is not a new issue, I'm curious what the new interest is in it. b) Whatever the solution to this is, it is *not* CPython specific, any decision should be reflected in the Python language spec IMO, if CPython has the semantic that dicts aren't vulnerable to hash collision then users *will* rely on this and another implementation having a different (valid) behavior opens up users to security issues. c) I'm not convinced a randomized hash is appropriate for the default dict, for a number of reasons: it's a performance hit on every dict operations, using a per-process seed means you can't compile the hash of an obj at Python's compile time, a per-dict seed inhibits a bunch of other optimizations. These may not be relevant to CPython, but they are to PyPy and probably the invoke-dynamic work on Jython (pursuant to point b). Therefore I think these should be considered application issues, since request limiting is difficult and error prone, I'd recommend the Python stdlib including a non-hash based map (such as a binary tree). Alex ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue. It is believed that they give us better-than-random results for commonly encountered datasets. A change to randomized hashes would have a negative performance impact on those cases. Also, randomizing the hash wreaks havoc on doctests, book examples not matching actual dict reprs, and on efforts by users to optimize the insertion order into dicts with frequent lookups. Raymond On Dec 28, 2011, at 5:28 PM, Michael Foord wrote: > Hello all, > > A paper (well, presentation) has been published highlighting security > problems with the hashing algorithm (exploiting collisions) in many > programming languages Python included: > > > http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf > > Although it's a security issue I'm posting it here because it is now public > and seems important. > > The issue they report can cause (for example) handling an http post to > consume horrible amounts of cpu. For Python the figures they quoted: > > reasonable-sized attack strings only for 32 bits Plone has max. POST > size of 1 MB > 7 minutes of CPU usage for a 1 MB request > ~20 kbits/s → keep one Core Duo core busy > > This was apparently reported to the security list, but hasn't been responded > to beyond an acknowledgement on November 24th (the original report didn't > make it onto the security list because it was held in a moderation queue). > > The same vulnerability was reported against various languages and web > frameworks, and is already fixed in some of them. > > Their recommended fix is to randomize the hash function. > > All the best, > > Michael > > > -- > http://www.voidspace.org.uk/ > > > May you do good and not evil > May you find forgiveness for yourself and forgive others > May you share freely, never taking more than you give. > -- the sqlite blessing > http://www.sqlite.org/different.html > > > > > > ___ > Python-Dev mailing list > [email protected] > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/raymond.hettinger%40gmail.com ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
Am 29.12.2011 02:37, schrieb Jesse Noller: > Back up link for the PDF: > http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf > > Ocert disclosure: > http://www.ocert.org/advisories/ocert-2011-003.html >From http://www.nruns.com/_downloads/advisory28122011.pdf --- Python uses a hash function which is very similar to DJBX33X, which can be broken using a meet-in-the-middle attack. It operates on register size and is thus different for 64 and 32 bit machines. While generating multi-collisions efficiently is also possible for the 64 bit version of the function, the resulting colliding strings are too large to be relevant for anything more than an academic attack. Plone as the most prominent Python web framework accepts 1 MB of POST data, which it parses in about 7 minutes of CPU time in the worst case. This gives an attacker with about 20 kbit/s the possibility to keep one Core Duo core constantly busy. If the attacker is in the position to have a Gigabit line available, he can keep about 50.000 Core Duo cores busy. --- If I remember correctly CPython uses the long for its hash function so 64bit Windows uses a 32bit hash. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
Am 29.12.2011 03:09, schrieb Raymond Hettinger: > FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue. > It is believed that they give us better-than-random results for commonly > encountered datasets. A change to randomized hashes would have a > negative performance impact on those cases. > > Also, randomizing the hash wreaks havoc on doctests, book examples > not matching actual dict reprs, and on efforts by users to optimize > the insertion order into dicts with frequent lookups. My five cents on the topic: I totally concur with Raymound. He, Tim and all the others did a fantastic job with the dict implementation and optimization. We shouldn't overreact and mess with the current hashing and dict code just because a well-known and old attack vector pops up again. The dict code is far too crucial for Python's overall performance. However the issue should be documented in our docs. I've been dealing with web stuff and security for almost a decade. I've seen far worse attack vectors. This one can easily be solved with a couple of lines of Python code. For example Application developers can limit the maximum amount of POST parameters to a sensible amount and limit the length of each key, too. The issue less severe on platforms with 64bit hashes, so it won't affect most people. I think only 32bit Unix and Windows in general (32bit long) are in trouble. CPython could aid developers with a special subclass of dict. The crucial lookup function is already overwrite-able per dict instance and on subclasses of dict through PyDictObj's struct member PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key, long hash). For example specialized subclass could limit the seach for a free slot to n recursions or choose to ignore the hash argument and calculate its own hash of the key. Christian ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
On Wed, Dec 28, 2011 at 19:51, Alex Gaynor wrote: > A few thoughts on this: > > a) This is not a new issue, I'm curious what the new interest is in it. Well they (the presenters of the report) had to be accepted to that conference for *something*, otherwise we wouldn't know they exist. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
