Hi,

We have an RDD<UserId> that needs to be enriched with information from HBase, where the HBase row key is exactly the user id.
What are the alternatives for doing this?

- Is it possible to do HBase.get() requests from a map function in Spark? (A rough sketch of what I mean is below.)
- Or should we load the full HBase table as an RDD and join against it?

I ask because a full table scan feels inefficient, especially if the input RDD<UserId> is really small compared to the full table. But I realize that a full table scan may not be what actually happens in practice?

Cheers,
-Kristoffer
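P.S. For the first option, here is roughly what I had in mind: do the gets from mapPartitions rather than map, so one HBase connection is opened per partition instead of per record. This is only a sketch under my own assumptions; the "users" table name and the "info:name" column are made up, and UserId is assumed to be a plain String.

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.rdd.RDD

// Hypothetical table/column names; adjust to the real schema.
def enrichWithHBase(userIds: RDD[String]): RDD[(String, Option[String])] =
  userIds.mapPartitions { ids =>
    // Build the connection on the executor; HBase connections are not
    // serializable, so they cannot be created on the driver and shipped
    // inside the closure.
    val hbaseConf = HBaseConfiguration.create()
    val connection = ConnectionFactory.createConnection(hbaseConf)
    val table = connection.getTable(TableName.valueOf("users"))

    // Materialize the results before closing the connection, otherwise the
    // lazy iterator would run the gets after the table is already closed.
    val results = ids.map { id =>
      val result = table.get(new Get(Bytes.toBytes(id)))
      val name = Option(result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name")))
        .map(bytes => Bytes.toString(bytes))
      (id, name)
    }.toList

    table.close()
    connection.close()
    results.iterator
  }

Does something like this look reasonable, or is the join approach preferable?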