Why is this on NANOG? Yes, it is "possible", but the CPU use and time will be absurd compared to just sending the data across the network.
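To put a rough number on "absurd", here is a back-of-envelope calculation in Python. The 10^9 MD5 hashes per second rate is an assumed, generous figure chosen purely for illustration:

    # Back-of-envelope only; the 10^9 hashes/sec rate is an assumed figure.
    SECONDS_PER_YEAR = 60 * 60 * 24 * 365

    hash_space = 2 ** 128             # distinct MD5 values
    expected_tries = hash_space // 2  # average work before the first hit
    rate = 10 ** 9                    # assumed MD5 hashes per second

    years = expected_tries // rate // SECONDS_PER_YEAR
    print(f"roughly {years:.2e} years")   # on the order of 5e+21 years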
I would say attempting this with even a small file will end up laughable. Passwords are only a few bytes, and even they hold up against brute force for significant lifetimes.

--
Justin Cook

On 19 May 2011 01:03, "Heath Jones" <hj1...@gmail.com> wrote:
> My point here is that it IS possible to transfer just a hash and a counter
> value and effectively generate identical data at the remote end.
> The limit that will be hit is the difficulty of generating and comparing
> hash values with current processing power.
>
> I'm proposing iterating through generated data up until the actual data.
> It's not even a storage issue: once you have incremented past a candidate,
> you don't need to store old data or hash values - just the counter. No
> massive hash tables.
>
> It's a CPU issue.
>
> Heath
>
> On 19 May 2011 00:42, <valdis.kletni...@vt.edu> wrote:
>
>> On Thu, 19 May 2011 00:26:26 BST, Heath Jones said:
>> > I wonder if this is possible:
>> >
>> > - Take a hash of the original file. Keep a counter.
>> > - Generate data in some sequential method on the sender side (for
>> >   example, simply starting at 0 and iterating until you generate the
>> >   same data as the original).
>> > - Each time you iterate, take the hash of the generated data. If it
>> >   matches the hash of the original file, increment the counter.
>> > - Send the hash and the counter value to the recipient.
>> > - The recipient performs the same sequential generation method,
>> >   stopping when the counter is reached.
>>
>> MD5 is a 128-bit hash.
>>
>> 2^128 is 340,282,366,920,938,463,463,374,607,431,768,211,456 - you're
>> welcome to iterate that many times to find a duplicate. You may get lucky
>> and get a hit in the first trillion or so attempts - but you may get
>> unlucky and not get a hit until the *last* few trillion attempts. On
>> average you'll have to iterate about half that huge number before you
>> get a hit.
>>
>> And it's lossy - if you hash all the possible 4K blocks with MD5, you'll
>> find that each of those 2^128 hash values is hit about 2^32640 times
>> (there are 2^32768 possible 4K blocks spread across only 2^128 hash
>> values) - and nothing in the hash tells you *which* of the colliding 4K
>> blocks you have on this iteration. (The only reason companies can do
>> block-level de-duplication by saving a hash as an index to one copy
>> shared by all blocks with the same hash value is that you have a *very
>> small* fraction of the possibilities covered, so if you saved a 4K block
>> of data from somebody's system32 folder under a given MD5 hash, it's
>> *far* more likely that another block with that same hash is from another
>> copy of an identical system32 folder than it is an actual accidental
>> collision.)
>>
>> Protip: A good hash function is by definition one-way - given the data,
>> it's easy to generate the hash - but reversing it to find the "pre-image"
>> (the data that *generated* the hash) is massively difficult.
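For concreteness, here is a minimal sketch of the scheme as Heath describes it, assuming the "sequential generation" simply counts through every byte string of the file's length (the length has to be sent too, which the original outline leaves out; the function names and enumeration order are illustrative assumptions, not anything from the thread). It is runnable only for inputs of a byte or two - for anything larger the loop never finishes, which is exactly the objection above:

    # A minimal sketch of the hash-plus-counter scheme, illustrative only.
    import hashlib

    def sender(data: bytes) -> tuple[bytes, int]:
        # Counter = how many earlier same-length candidates collide with
        # data's MD5 before data itself turns up in the enumeration.
        target = hashlib.md5(data).digest()
        counter = 0
        for i in range(256 ** len(data)):            # every possible value
            candidate = i.to_bytes(len(data), "big")
            if hashlib.md5(candidate).digest() == target:
                if candidate == data:
                    return target, counter
                counter += 1                         # a colliding pre-image
        raise AssertionError("data always appears in its own enumeration")

    def recipient(target: bytes, counter: int, length: int) -> bytes:
        # Replay the same enumeration and stop at the counter-th match.
        seen = 0
        for i in range(256 ** length):
            candidate = i.to_bytes(length, "big")
            if hashlib.md5(candidate).digest() == target:
                if seen == counter:
                    return candidate
                seen += 1
        raise AssertionError("counter came from the same enumeration")

    h, c = sender(b"hi")        # 2 bytes: 65,536 MD5s, finishes instantly
    print(recipient(h, c, 2))   # b'hi' - but at 8 bytes this is 2^64 MD5s

And even if the CPU time were free, the pigeonhole argument above means nothing is saved: with about 2^32640 four-kilobyte blocks per hash value, the counter needed to say *which* colliding block you meant must on average be about as long as the block itself.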