- After loading large RDDs that occupy more than 60-70% of total cluster
   memory, (k, v) operations such as finding uniques/distinct, groupByKey,
   and set operations would be network bound.
   - A multi-stage MapReduce DAG should also be a good test; a rough sketch
   of one such shuffle-heavy job is below. When we tried this on Hadoop, we
   used examples from genomics. Has anyone tried BLAST with Spark?

Cheers
<k/>


On Fri, Jun 27, 2014 at 5:07 PM, Ryan Compton <compton.r...@gmail.com>
wrote:

> We are going to upgrade our cluster from 1g to 10g ethernet. I'd like
> to run some benchmarks before and after the upgrade. Can anyone
> suggest a few typical Spark workloads that are network-bound?
>
