andy petrella <[email protected]> writes:
> Oh I was almost sure that lookup was optimized using the partition info

lookup does use the partitioner to run only one task, but within that task it
still has to scan the entire partition:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala#L710
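
To make the cost concrete, here is a rough toy model in plain Python. It is
not Spark code: partition_index, build_partitions and this lookup are
hypothetical names made up for illustration, and the modulo hash only stands
in for a real partitioner. It just shows the two steps, picking one partition
and then scanning all of it.

```python
# Toy model of a hash-partitioned key/value dataset. NOT Spark's API:
# every name below is hypothetical and exists only for illustration.

def partition_index(key, num_partitions):
    # Stand-in for a hash partitioner: maps a key to one partition.
    return hash(key) % num_partitions

def build_partitions(pairs, num_partitions):
    # Distribute (key, value) pairs into buckets by partition index.
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[partition_index(key, num_partitions)].append((key, value))
    return partitions

def lookup(partitions, key):
    # Step 1: the partitioner narrows the search to a single partition,
    # so only one "task" has to run.
    index = partition_index(key, len(partitions))
    # Step 2: inside that partition there is no index, so every record
    # is examined, whether or not it matches.
    scanned = 0
    values = []
    for k, v in partitions[index]:
        scanned += 1
        if k == key:
            values.append(v)
    return values, scanned

pairs = [(i, "v%d" % i) for i in range(100)]
partitions = build_partitions(pairs, 4)
values, scanned = lookup(partitions, 42)
print(values, scanned)
```

With 100 integer keys over 4 partitions, the lookup touches one partition out
of four but still reads every record in it, so the cost grows with partition
size rather than staying constant.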

Ankur
