Re: create a SchemaRDD from a custom datasource

2015-01-13 Thread Reynold Xin
If it is a small collection of them on the driver, you can just use sc.parallelize to create an RDD.

On Tue, Jan 13, 2015 at 7:56 AM, Malith Dhanushka wrote:
> Hi Reynold,
>
> Thanks for the response. I am just wondering, let's say we have a set of Row
> objects. Isn't there a straightforward way
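
A minimal sketch of the sc.parallelize suggestion, assuming the Spark 1.2-era API, in which SQLContext.applySchema turns an RDD[Row] plus a StructType into a SchemaRDD. The object name, the column names (name, age), and the sample rows are hypothetical and only illustrate the shape of the call; later Spark versions moved these types and renamed the method, so treat this as an illustration rather than the thread's exact solution.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql._

object SmallCollectionSketch {
  // Hypothetical rows that already live on the driver.
  val rows: Seq[Row] = Seq(Row("alice", 30), Row("bob", 25))

  // Hypothetical schema describing those rows (Spark 1.2-era types).
  val schema: StructType = StructType(Seq(
    StructField("name", StringType, nullable = false),
    StructField("age", IntegerType, nullable = false)))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("small-collection-sketch")
      .setMaster("local[2]") // local mode, for illustration only
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    try {
      // Distribute the small driver-side collection as an RDD[Row].
      val rowRDD = sc.parallelize(rows)

      // Attach the schema to obtain a SchemaRDD (assumes Spark 1.2).
      val people = sqlContext.applySchema(rowRDD, schema)
      people.registerTempTable("people")

      sqlContext.sql("SELECT name FROM people WHERE age > 26")
        .collect()
        .foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

This only suits data small enough to fit in driver memory, since sc.parallelize ships the whole collection from the driver to the executors.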

Re: create a SchemaRDD from a custom datasource

2015-01-13 Thread Reynold Xin
It depends on what the other side is doing. You can create your own RDD implementation by subclassing RDD, or it might work if you use sc.parallelize(1 to n, n).mapPartitionsWithIndex( /* code to read the data and return an iterator */ ) where n is the number of partitions.

On Tue, Jan 13, 2015 at 12
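
The mapPartitionsWithIndex approach could look roughly like the sketch below. The readPartition helper is a hypothetical stand-in for whatever code reads from the custom data source; here it only fabricates placeholder strings so the example runs without an external system. The partition count n and the records-per-partition value are likewise arbitrary.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PartitionedSourceSketch {
  // Hypothetical reader: returns the records belonging to one partition.
  // A real implementation would open a connection to the data source here
  // and stream its records back as an iterator.
  def readPartition(index: Int, recordsPerPartition: Int): Iterator[String] =
    (0 until recordsPerPartition).iterator.map(i => s"partition-$index-record-$i")

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("partitioned-source-sketch")
      .setMaster("local[2]") // local mode, for illustration only
    val sc = new SparkContext(conf)
    try {
      val n = 4 // number of partitions

      // The seed values 1 to n are ignored; they exist only so that the RDD
      // has n partitions. The partition index (0-based) selects which slice
      // of the source each task reads.
      val rdd = sc.parallelize(1 to n, n).mapPartitionsWithIndex { (index, _) =>
        readPartition(index, 3)
      }

      rdd.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Because the read happens inside the task, the data is pulled on the executors rather than on the driver, which is the main difference from parallelizing an existing collection.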