hzh...@gmail.com wrote: >> So it seems we both misunderstood the problem. >> >> I didn't read the top level article until now, and reading it, I can't make >> sense of it. >> > > Seems that you should read the whole thing before making a post, or > else you cannot know what we are talking about. > Steven doesn't misunderstand me. We are talking about what I need, and > he tries to help. > > > >>> "Given the function hashlib.sha256, enumerate all the possible inputs >>> that give the hexadecimal result >>> 0a2591aaf3340ad92faecbc5908e74d04b51ee5d2deee78f089f1607570e2e91." >> I tried some "parrot" variants but no dice. :-( >> >> [snip] >> > > This is a hash collision problem. Nobody has proved that SHA-256 is > collision free, even not in the random oracle model, because people > always suppose that a random oracle exists, and make hash function its > substitution. That means it may be broken someday. And any provable > security based on random oracle model is not secure. > It's very easy to prove that no hash function is collision-free, since the domain (all possible inputs) is much larger than the range (all possible outputs). Hence there must be many inputs that map to the same output.
A *good* hash function is unpredictable enough to make finding two colliding strings impractical - and even the best hash functions that cryptographers could devise at the time have been broken. We should remember that "broken" to a cryptographer means something rather different than it does in common usage, so a broken scheme need not necessarily be dropped immediately - one would just stop using it in new systems. > >>> I'm suggesting that, in general, there's no way to tell in advance which >>> regexes will be easy and which will be hard, and even when they are easy, >>> the enumeration will often be infinite. > > It is hard to tell in advance. However, we can add some timing limit > or counting limit, to make it an algorithm, which can halt. For > example, whenever the program outputs more than 1000000 expressions > that match the input regex, we can halt because that exceeds our > limit. But surely this is not efficient because of the post-decision. > >> Essentially, any regexp that includes '+' or '*' (directly or via e.g. >> notation >> that denotes "digit sequence") yields an infinite number of strings. > > Infinity is really relative, not absolute. It is relative to the > computing speed. For example, the regex '^[0|1]{2048}$' is rather > simple and doesn't contain '+' or '$', but trying to output all > expressions that match it has a complexity of 2^2048. If we can do > that, then we can break RSA-2048. > We must face the reality . > I have always understood that there's a pretty real distinction between "finite" and "infinite". Are you telling me I am wrong, or are you merely saying that some finite cases might just as well be infinite for practical purposes? And I really don't see how simple enumeration of range(2^2048) breaks RSA-2048, since that problem requires you to find two factors which, when multiplied together, give that specific value. regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 PyCon is coming! Atlanta, Feb 2010 http://us.pycon.org/ Holden Web LLC http://www.holdenweb.com/ UPCOMING EVENTS: http://holdenweb.eventbrite.com/ -- http://mail.python.org/mailman/listinfo/python-list