Today, I decided to check out Django's new password validation functionality (which is a great feature, btw).
I noticed there is a CommonPasswordValidator, which mentions "1000 common passwords"...

Part 1.

The first thing that came to mind was: how would one compile a list of 1000 common passwords without maintaining a rainbow table of millions of possible passwords AND having access to a large corpus of leaked password hashes (or databases of plain text passwords)?

Here it's worth noting the "This is Facebook, so I'll create a real password" vs. "This is just some forum I'll probably never come back to, so I'll just use hunter2" phenomenon. Now, given the second part of the question (a large corpus of data), or even more so the plain text case, where does intuition tell you the majority of this kind of data would come from? Facebook / Twitter / online banks? Or forums and defunct websites?

With that, I think I've shown why it is doubtful that any list of the "N most common passwords" can be established with even remote accuracy.

Part 2.

So, with the above thoughts in mind, I decided to have a look at the passwords Django is using and find their origin (did they come from a compiled list of "leaked" databases, or from somewhere else?).

The list (plain text: https://gist.github.com/anonymous/59e9eb2935165d7b0fa9), I found after a quick search, is copied wholesale from a website called passwordrandom.com. The website appears to be owned by one Dmitriy Koshevoy in Ukraine, but other than that I know nothing about it. The list that Django uses comes specifically from this page: http://www.passwordrandom.com/most-popular-passwords - it purports to have the 10,000 most commonly used passwords (in order!), but says nothing about where they came from.

I figured, maybe this website is quite popular for password validation / generation, and Dmitriy has compiled... seems like a pretty bad idea to give them your password, but oh well. Except that passwordrandom.com has basically no traffic, according to SimilarWeb, Compete, and Alexa.
Side note: passwordrandom also features this strange and suspicious joke: http://www.passwordrandom.com/password-database. Hopefully nobody has entered their real password there or anywhere else on the website, or used the site to generate a password, lest they lose it to the public domain, since the website doesn't even employ TLS.

Conclusion.

With all that, I'm now wondering how this list of "common" passwords made it into Django's code base. Perhaps it should be removed since, as argued above, it provides no verifiable value or security. It could just as well be replaced with a configuration option (a list setting or a file path setting), to maintain backwards compatibility (and warm fuzzies for those who think *they* know the most common passwords?).
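To make the configuration-option idea a bit more concrete, here is a rough sketch of what a validator driven by a user-supplied list might look like. This is only an illustration, not Django's code: the class name ConfigurablePasswordListValidator and the parameters passwords and list_path are made up for this example, and it raises a plain ValueError so that it runs without Django installed (a real validator would presumably raise Django's ValidationError instead).

```python
from pathlib import Path


class ConfigurablePasswordListValidator:
    """Hypothetical validator: rejects passwords found in a caller-supplied list.

    The list can come from an in-memory iterable (the "list setting" idea),
    from a text file with one password per line (the "file path setting"
    idea), or from both at once.
    """

    def __init__(self, passwords=None, list_path=None):
        entries = set()
        if passwords is not None:
            # Normalise so the comparison is case-insensitive.
            entries.update(p.strip().lower() for p in passwords)
        if list_path is not None:
            text = Path(list_path).read_text(encoding="utf-8")
            entries.update(line.strip().lower() for line in text.splitlines())
        entries.discard("")  # ignore blank lines and empty entries
        self.passwords = entries

    def validate(self, password):
        # Assumption: ValueError stands in for a framework-specific error type.
        if password.strip().lower() in self.passwords:
            raise ValueError("This password is on the configured list.")


# Example: the site operator decides what counts as "common".
validator = ConfigurablePasswordListValidator(passwords=["hunter2", "password"])
validator.validate("correct horse battery staple")  # not on the list, no error
```

With something along these lines, the question of where the list comes from is left to whoever deploys the site, rather than being baked into the framework.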
