Re: indexedrdd and radix tree: how to search indexedRDD using all prefixes?

2015-11-24 Thread Mina
This is what a Radix tree returns -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/indexedrdd-and-radix-tree-how-to-search-indexedRDD-using-all-prefixes-tp25459p25460.html Sent from the Apache Spark User List mailing list archive at Nabble.com

indexedrdd and radix tree: how to search indexedRDD using all prefixes?

2015-11-24 Thread Mina
Hello, I have a question about the radix tree (PART) implementation in Spark's IndexedRDD. I explored the source code and found that the radix tree used in IndexedRDD only returns exact matches. That seems to restrict its use. For example, I want to find children nodes using prefix
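To make the distinction in the question concrete, here is a minimal sketch of exact-match lookup versus prefix search on a toy character trie. It is an illustration only: `PrefixTrie`, `put`, `get`, and `with_prefix` are hypothetical names, and this is not the PART radix tree or any part of the IndexedRDD API.

```python
_VALUE = object()  # sentinel dict key marking "a value is stored at this node"


class PrefixTrie:
    """Toy character trie (hypothetical; not IndexedRDD's PART structure)."""

    def __init__(self):
        self._root = {}

    def put(self, key, value):
        # Walk or create one child node per character, then store the value.
        node = self._root
        for ch in key:
            node = node.setdefault(ch, {})
        node[_VALUE] = value

    def _descend(self, key):
        # Return the node reached by following `key`, or None if the path is missing.
        node = self._root
        for ch in key:
            node = node.get(ch)
            if node is None:
                return None
        return node

    def get(self, key):
        # Exact match: only a value stored at the end of the full key counts.
        node = self._descend(key)
        return None if node is None else node.get(_VALUE)

    def with_prefix(self, prefix):
        # Prefix search: collect every (key, value) stored at or below the prefix node.
        start = self._descend(prefix)
        if start is None:
            return []
        results = []
        stack = [(prefix, start)]
        while stack:
            path, node = stack.pop()
            for edge, child in node.items():
                if edge is _VALUE:
                    results.append((path, child))
                else:
                    stack.append((path + edge, child))
        return sorted(results, key=lambda kv: kv[0])


trie = PrefixTrie()
for k, v in [("spark", 1), ("spar", 2), ("span", 3), ("scala", 4)]:
    trie.put(k, v)

exact = trie.get("spa")            # no value is stored under exactly "spa"
matches = trie.with_prefix("spa")  # every stored key that starts with "spa"
```

Here `get("spa")` finds nothing because no key equals "spa", while `with_prefix("spa")` walks the subtree below that node and returns the stored children. This is the kind of query the question asks about.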

Re: Spark IndexedRDD dependency in Maven

2015-11-09 Thread Ted Yu
I would suggest asking this question on SPARK-2365 since IndexedRDD has not been released (upstream) Cheers On Mon, Nov 9, 2015 at 1:34 PM, swetha wrote: > > Hi , > > What is the appropriate dependency to include for Spark Indexed RDD? I get > compilation error if I include 0.3

Spark IndexedRDD dependency in Maven

2015-11-09 Thread swetha
Hi, What is the appropriate dependency to include for Spark Indexed RDD? I get a compilation error if I include 0.3 as the version, as shown below: groupId amplab, artifactId spark-indexedrdd, version 0.3. Thanks, Swetha

INDEXEDRDD in PYSPARK

2015-09-03 Thread shahid ashraf
Hi folks, are there any resources for getting started with https://github.com/amplab/spark-indexedrdd in PySpark? -- With regards, Shahid Ashraf

Re: Is IndexedRDD available in Spark 1.4.0?

2015-07-23 Thread Ruslan Dautkhanov
>> data, that is, key-value stores, databases, etc. >> >> On Tue, Jul 14, 2015 at 5:44 PM, Ted Yu wrote: >> >>> Please take a look at SPARK-2365 which is in progress. >>> >>> On Tue, Jul 14, 2015 at 5:18 PM, swetha >>> wrote: >>

Re: Is IndexedRDD available in Spark 1.4.0?

2015-07-14 Thread Ted Yu
d system > that is designed and optimized for long term storage of data, that is, > key-value stores, databases, etc. > > On Tue, Jul 14, 2015 at 5:44 PM, Ted Yu wrote: > >> Please take a look at SPARK-2365 which is in progress. >> >> On Tue, Jul 14, 2015 at 5:18 PM,

Re: Is IndexedRDD available in Spark 1.4.0?

2015-07-14 Thread Tathagata Das
: > Please take a look at SPARK-2365 which is in progress. > > On Tue, Jul 14, 2015 at 5:18 PM, swetha wrote: > >> Hi, >> >> Is IndexedRDD available in Spark 1.4.0? We would like to use this in Spark >> Streaming to do lookups/updates/deletes in RDDs using keys b

Re: Is IndexedRDD available in Spark 1.4.0?

2015-07-14 Thread Ted Yu
Please take a look at SPARK-2365 which is in progress. On Tue, Jul 14, 2015 at 5:18 PM, swetha wrote: > Hi, > > Is IndexedRDD available in Spark 1.4.0? We would like to use this in Spark > Streaming to do lookups/updates/deletes in RDDs using keys by storing them > as

Is IndexedRDD available in Spark 1.4.0?

2015-07-14 Thread swetha
Hi, Is IndexedRDD available in Spark 1.4.0? We would like to use this in Spark Streaming to do lookups/updates/deletes in RDDs using keys by storing them as key/value pairs. Thanks, Swetha
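As a rough illustration of the access pattern described here (lookups, updates, and deletes by key on an otherwise immutable collection), the following is a minimal pure-Python sketch. `PartitionedKV` and its methods are hypothetical names chosen for this example; it is not the IndexedRDD API and it makes no claim about how IndexedRDD stores data internally.

```python
class PartitionedKV:
    """Hypothetical immutable key/value collection split into hash partitions."""

    def __init__(self, num_partitions=4, partitions=None):
        self._n = num_partitions
        if partitions is None:
            partitions = tuple({} for _ in range(num_partitions))
        self._parts = partitions

    def _index(self, key):
        # Route each key to exactly one partition.
        return hash(key) % self._n

    def _replace(self, i, new_part):
        # Build a new collection that shares every untouched partition.
        parts = self._parts[:i] + (new_part,) + self._parts[i + 1:]
        return PartitionedKV(self._n, parts)

    def get(self, key):
        # A lookup touches only the partition that owns the key.
        return self._parts[self._index(key)].get(key)

    def put(self, key, value):
        # An update copies one partition and returns a new version.
        i = self._index(key)
        new_part = dict(self._parts[i])
        new_part[key] = value
        return self._replace(i, new_part)

    def delete(self, key):
        # A delete also returns a new version; the old one is unchanged.
        i = self._index(key)
        new_part = {k: v for k, v in self._parts[i].items() if k != key}
        return self._replace(i, new_part)


v1 = PartitionedKV().put("a", 1).put("b", 2)
v2 = v1.put("a", 10)   # update: v1 still sees the old value
v3 = v2.delete("b")    # delete: v2 still sees "b"
```

Each operation returns a new version and leaves earlier versions readable, which mirrors the immutability of RDDs. Copying a whole partition on every write is a simplification made for this sketch; a real implementation would want a cheaper update path.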

Re: IndexedRDD

2015-01-13 Thread Jem Tucker
any time increase at all until the small table was within 1 order of magnitude of the larger. I agree though, the performance is not bad at all! The same join with normal RDDs takes an order of magnitude longer, I found; I can share the results tomorrow. I am unsure exactly how the IndexedRDD are inde

Re: IndexedRDD

2015-01-13 Thread Jerry Lam
Hi guys, I'm interested in the IndexedRDD too. How many rows in the big table match the small table in every run? If the number of rows stays constant, then I think Jem wants the runtime to stay about constant (i.e. ~0.6 seconds for all cases). However, I agree with Andrew. The perfor

Re: IndexedRDD

2015-01-13 Thread Andrew Ash
Hi Jem, Linear time in scaling on the big table doesn't seem that surprising to me. What were you expecting? I assume you're doing normalRDD.join(indexedRDD). If you were to replace the indexedRDD with a normal RDD, what times do you get? On Tue, Jan 13, 2015 at 5:35 AM, Jem Tuc

IndexedRDD

2015-01-13 Thread Jem Tucker
Hi, I have been playing around with the IndexedRDD (https://issues.apache.org/jira/browse/SPARK-2365, https://github.com/amplab/spark-indexedrdd) and have been very impressed with its performance. Some performance testing has revealed worse-than-expected scaling of the join performance*, and I