Spark is obviously well-suited to crunching massive amounts of data. How
about crunching massive amounts of numbers?

A few years ago I put together a little demo for some co-workers to
demonstrate the dangers of using SHA1
<http://codahale.com/how-to-safely-store-a-password/> to hash and store
passwords. Part of the demo included a live brute-forcing of hashes to show
how SHA1's speed made it unsuitable for hashing passwords.

I think it would be cool to redo the demo, but this time use the power of a
Spark-managed cluster to crunch through hashes even faster.

But how would you do that with Spark (if at all)?

I'm guessing you would create an RDD that somehow defines the search space
you want to cover, and then partition it so the work is divided evenly
across the cluster's cores, something like the sketch below. Does that
sound right?

I wonder if others have already used Spark for computationally-intensive
workloads like this, as opposed to just data-intensive ones.

Nick
