On Fri, Jul 26, 2024, at 15:02, Tim Düsterhus wrote:
> Hi
>
> On 7/26/24 14:50, Rob Landers wrote:
> >>> $_SESSION['token'] = md5(uniqid(mt_rand(), true));
> >>
> >> *Exactly* the md5-uniqid construction that is called out as unsafe in
> >> the RFC and used in a security context.
> >
> > In regards to hashing, this is likely fine; for now. There still isn't an
> > arbitrary pre-image attack on md5 (that I'm aware of). Can you create a
> > random file with a matching hash? Yes, in a few seconds, on modern
> > hardware. But you cannot yet make it have arbitrary contents in our
> > lifetime. The NSA probably has something like this though, but if so, this
> > isn't widely known.
>
> Neither collision-, nor pre-image resistance is relevant here. The
> attack vector is a brute force attack / an attacker guessing the token
> rather than the token's contents.

You do realize that GUIDs and md5 hashes are the same size (128 bits)? One does not simply "guess" a GUID or an md5 hash. Gravatar used md5 until a couple of years ago, and had millions? billions? of email addresses and zero collisions.

> > That being said, this is just randomly creating a random id without leaking
> > its internal construction, no different than putting an md5 in a UUID-v8.
> > The real issue here is the use of uniqid() and rand(), making it quite
> > likely (at scale, at least) that a session id will overlap with another
> > session id.
>
> The point is that it showcases a fundamental misunderstanding of what
> MD5 (or really any other hash algorithm) does for you. The application
> of the MD5 does not make the token more random or more unique or
> whatever positive adjective you would like to use. It would be equally
> strong (or rather weak) if the output of `uniqid(mt_rand(), true)` was
> used directly.

Yes, it does, but probably not how you think. It would be much weaker to leak the internal construction (uniqid(mt_rand(), true)), because then someone could literally guess a working id if they knew when the id was generated (depending on the size of mt_rand, rate limits, etc.). By wrapping it in an md5, it is literally unguessable how it is constructed, but the construction is still crap in this case.

>
> As per Kerckhoffs's principle, the security of the algorithm must not
> rely on the attacker not knowing how it's implemented. Given how
> prevalent constructions like the above are, an attacker could make an
> educated guess about how it looks like and match their own token against
> a precomputed table to find out if it matches.

In this example, an ID is being constructed. If it needs uniqueness, the ID is being constructed incorrectly. If you argue that a GUID would fit the bill here, then an md5 hash has room for more "entropy" than a GUIDv4 (128 bits versus 122 random bits). But due to how the input to the md5 is constructed, it actually has less entropy.
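
To make the entropy point concrete, here is a minimal sketch that is not taken from either message. `toyToken()` and `buildTable()` are hypothetical names, and the hidden input is deliberately limited to 2^16 values (an assumption chosen so the loop finishes quickly, not a measurement of what `uniqid(mt_rand(), true)` really provides). It only illustrates that an md5 output can never carry more entropy than its input, and that a precomputed table of the kind Tim describes works whenever the input space is small enough to enumerate.

```php
<?php
// Hypothetical stand-in for md5(uniqid(mt_rand(), true)): the hidden input
// is restricted to $space values so the whole space can be enumerated.
function toyToken(int $secret): string
{
    return md5('prefix-' . $secret);
}

// Attacker side: precompute every token the construction can produce.
function buildTable(int $space): array
{
    $table = [];
    for ($i = 0; $i < $space; $i++) {
        $table[toyToken($i)] = $i;
    }
    return $table;
}

$space = 1 << 16;              // assumed size of the input space (toy value)
$table = buildTable($space);

// Victim side: a token derived from one hidden input.
$victim = toyToken(random_int(0, $space - 1));

// The token looks like 128 bits (32 hex characters) ...
var_dump(strlen($victim) === 32);
// ... but at most $space distinct tokens exist,
var_dump(count($table) <= $space);
// so a single lookup tells the attacker which input produced it.
var_dump(isset($table[$victim]));
```

How large the real input space of `uniqid(mt_rand(), true)` is depends on how much the attacker knows about the generation time and the generator state, which is exactly the part the two of us disagree on above.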

So, I think we can both agree that the construction is crap. However, the usage of md5 doesn't matter here. If it really bothers you, craft a GUIDv8 from it. As for Kerckhoffs's principle, that is in regards to encryption ... this is not encryption.

— Rob