
Re: indexedrdd and radix tree: how to search indexedRDD using all prefixes?

2015-11-24 Thread Mina

This is what a Radix tree returns

Re: IndexedRDD

2015-01-13 Thread Jem Tucker

Hi, Thanks for the replies. I guess I was hoping for a bit better than linear scaling. The test was performing IndexedRDD.join(RDD)((id, a, b) => (a, b)). In each join, every row in the smaller table is joined to one in the lookup. I ran the same test with standard RDD joins and there was barely any ti
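
For readers following along, here is a minimal sketch of what a keyed join with a combining function such as (id, a, b) => (a, b) produces. It uses plain Scala collections rather than Spark, and joinWith, lookup and small are hypothetical names chosen for illustration. It is not the IndexedRDD API and says nothing about how IndexedRDD implements its join.

```scala
object JoinSketch {
  // Hypothetical stand-in for a keyed join (an assumption, not IndexedRDD's API):
  // for each row of the smaller table, find the matching row in the lookup
  // table and combine the two values with f.
  def joinWith[K, A, B, C](lookup: Map[K, A], small: Seq[(K, B)])(f: (K, A, B) => C): Seq[(K, C)] =
    small.flatMap { case (id, b) =>
      // Rows without a match in the lookup table are dropped.
      lookup.get(id).map(a => (id, f(id, a, b)))
    }

  // Made-up example data.
  val lookup: Map[Long, String] = Map(1L -> "alice", 2L -> "bob", 3L -> "carol")
  val small: Seq[(Long, Int)] = Seq(2L -> 20, 3L -> 30)

  // Same shape of combining function as in the message: keep both values.
  val joined: Seq[(Long, (String, Int))] = joinWith(lookup, small)((id, a, b) => (a, b))
}
```

In this sketch each row of small is matched to at most one row in lookup, which mirrors the "every row in the smaller table is joined to one in the lookup" setup described above.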

Re: IndexedRDD

2015-01-13 Thread Jerry Lam

Hi guys, I'm interested in the IndexedRDD too. How many rows in the big table match the small table in each run? If the number of rows stays constant, then I think Jem wants the runtime to stay about constant (i.e. ~ 0.6 second for all cases). However, I agree with Andrew. The performance w
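
The two expectations in this thread can be illustrated with a small sketch in plain Scala collections (no Spark). It counts how many rows each join strategy visits: a scan-driven join walks the whole big table, while a lookup-driven join walks only the small table and probes an index on the big one. The names scanJoin, lookupJoin and run are hypothetical, and which strategy IndexedRDD actually uses is not stated in the thread, so treat this as a model of the argument rather than a description of the library.

```scala
object JoinCostSketch {
  type Row = (Int, String)

  // Scan-driven join: visit every big-table row and check the small table.
  // Work grows with the size of the big table.
  def scanJoin(big: Seq[Row], small: Map[Int, String]): (Seq[(Int, (String, String))], Int) = {
    var visited = 0
    val out = big.flatMap { case (k, v) =>
      visited += 1
      small.get(k).map(s => (k, (v, s)))
    }
    (out, visited)
  }

  // Lookup-driven join: visit only small-table rows and probe an index
  // built on the big table. Work grows with the size of the small table.
  def lookupJoin(bigIndex: Map[Int, String], small: Seq[Row]): (Seq[(Int, (String, String))], Int) = {
    var visited = 0
    val out = small.flatMap { case (k, s) =>
      visited += 1
      bigIndex.get(k).map(v => (k, (v, s)))
    }
    (out, visited)
  }

  // Build a big table of n rows and a fixed small table of 10 rows,
  // then return (rows visited by scanJoin, rows visited by lookupJoin).
  def run(n: Int): (Int, Int) = {
    val big: Seq[Row] = (0 until n).map(i => (i, s"row$i"))
    val small: Seq[Row] = (0 until 10).map(i => (i, s"s$i"))
    val (scanOut, scanVisited) = scanJoin(big, small.toMap)
    val (lookupOut, lookupVisited) = lookupJoin(big.toMap, small)
    require(scanOut == lookupOut) // both strategies produce the same joined rows
    (scanVisited, lookupVisited)
  }
}
```

With the small table held at 10 rows, the first count returned by run(n) grows with n while the second stays at 10. That is the difference between the linear scaling Andrew describes and the roughly constant runtime Jem was hoping for.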

Re: IndexedRDD

2015-01-13 Thread Andrew Ash

Hi Jem, Runtime that scales linearly with the size of the big table doesn't seem that surprising to me. What were you expecting? I assume you're doing normalRDD.join(indexedRDD). If you were to replace the indexedRDD with a normal RDD, what times do you get? On Tue, Jan 13, 2015 at 5:35 AM, Jem Tucker wrote: >