Re: Creating time-sequential pairs

2014-05-10 Thread Sean Owen
How about ...

    val data = sc.parallelize(Array((1,0.05),(2,0.10),(3,0.15)))
    val pairs = data.join(data.map(t => (t._1 + 1, t._2)))

It's a self-join, but one copy has its ID incremented by 1. I don't know if it's performant, but it works, although the output is more like:

    (2,(0.1,0.05))
    (3,(0.15,0.1))

On
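For readers without a Spark shell handy, the same shifted-key self-join can be sketched with ordinary Scala collections. This is only an illustration under assumptions: the object name `ShiftedSelfJoin` and the helper `consecutivePairs` are hypothetical, a `Map` lookup stands in for the RDD `join`, and the IDs are assumed to be unique, gap-free integers as in the example above.

```scala
object ShiftedSelfJoin {
  // Hypothetical helper: pairs each record with its predecessor by looking
  // up its ID in a copy of the data whose IDs were incremented by 1.
  // Assumes IDs are unique integers with no gaps.
  def consecutivePairs(data: Seq[(Int, Double)]): Seq[(Int, (Double, Double))] = {
    // The shifted copy: record n is re-keyed as n + 1.
    val shifted: Map[Int, Double] = data.map { case (id, v) => (id + 1, v) }.toMap
    // Inner-join semantics: IDs with no match in the shifted copy are dropped.
    data.flatMap { case (id, v) => shifted.get(id).map(prev => (id, (v, prev))) }
  }

  def main(args: Array[String]): Unit = {
    val data = Seq((1, 0.05), (2, 0.10), (3, 0.15))
    // Prints (2,(0.1,0.05)) and (3,(0.15,0.1)), one per line.
    consecutivePairs(data).foreach(println)
  }
}
```

The first record has no predecessor, so it falls out of the result, just as it does in the join output above. Because the match relies on keys differing by exactly 1, data keyed by arbitrary timestamps would first need a consecutive index.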

Creating time-sequential pairs

2014-05-10 Thread Nicholas Pritchard
Hi Spark community, I have a design/algorithm question that I assume is common enough for someone else to have tackled before. I have an RDD of time-series data formatted as time-value tuples, RDD[(Double, Double)], and am trying to extract threshold crossings. In order to do so, I first want to t