Re: Re: About SparkSQL OR MLlib

2014-09-15 Thread boyingk...@163.com
case class Car(id: String, age: Int, tkm: Int, emissions: Int, date: Date, km: Int, fuel: Int)
1. Create a PairedRDD of (age, Car) tuples (pairedRDD)
2. Create a new function fc
// returns the interval lower and upper bound
def fc(x: Int, interval: Int): (Int, Int) = {
  val floor = x - (x % interval)

Re: About SparkSQL OR MLlib

2014-09-15 Thread Soumya Simanta
case class Car(id: String, age: Int, tkm: Int, emissions: Int, date: Date, km: Int, fuel: Int)
1. Create a PairedRDD of (age, Car) tuples (pairedRDD)
2. Create a new function fc
// returns the interval lower and upper bound
def fc(x: Int, interval: Int): (Int, Int) = {
  val floor = x - (x % interval)
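The two steps in this reply (pair each Car with its age, then map the age to an interval with fc) can be sketched as below. This is only an illustration under stated assumptions: the post is cut off after the `floor` line, so the upper bound `floor + interval` is a guess; plain Scala collections stand in for the Spark pair RDD so the snippet runs without a cluster; `java.time.LocalDate` replaces the unspecified `Date` type; and `AgeIntervals`, `emissionsByAgeInterval`, and the choice to sum emissions are hypothetical names and choices, not something the thread states.

```scala
import java.time.LocalDate

// Same fields as the Car case class in the post; LocalDate is an assumption,
// the original only says `Date`.
case class Car(id: String, age: Int, tkm: Int, emissions: Int,
               date: LocalDate, km: Int, fuel: Int)

object AgeIntervals {
  // Returns the lower and upper bound of the interval that contains x.
  // Assumption: the upper bound is floor + interval; the original body
  // is truncated after the `floor` line.
  def fc(x: Int, interval: Int): (Int, Int) = {
    val floor = x - (x % interval)
    (floor, floor + interval)
  }

  // Hypothetical aggregation: total emissions per age interval.
  // A Seq stands in for the pair RDD described in step 1.
  def emissionsByAgeInterval(cars: Seq[Car], interval: Int): Map[(Int, Int), Int] =
    cars
      .map(car => (car.age, car))                      // step 1: (age, Car) pairs
      .groupBy { case (age, _) => fc(age, interval) }  // step 2: key by interval bounds
      .map { case (bounds, pairs) => bounds -> pairs.map(_._2.emissions).sum }
}
```

With an interval of 10, the drivers aged 34 and 36 from the sample data both map to the bounds (30, 40), so their records end up in the same group.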

About SparkSQL OR MLlib

2014-09-15 Thread boyingk...@163.com
Hi, I have a dataset with the structure [id, driverAge, TotalKiloMeter, Emissions, date, KiloMeter, fuel], and the data looks like this:
[1-980,34,221926,9,2005-2-8,123,14]
[1-981,49,271321,15,2005-2-8,181,82]
[1-982,36,189149,18,2005-2-8,162,51]
[1-983,51,232753,5,2005-2-8,106,92]
[1-984,56,45338,8,2005-2-8,156,
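For readers who want to experiment with rows in this shape, here is a minimal parsing sketch. It assumes each record is one bracketed, comma-separated line with exactly seven fields and a date written as year-month-day without zero padding; `CarRecord` and `CarRecordParser` are hypothetical names chosen for illustration, and `java.time.LocalDate` is an assumed date type.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical record type mirroring the seven fields listed in the question.
case class CarRecord(id: String, driverAge: Int, totalKiloMeter: Int,
                     emissions: Int, date: LocalDate, kiloMeter: Int, fuel: Int)

object CarRecordParser {
  // Accepts dates such as 2005-2-8 (month and day without zero padding).
  private val dateFormat = DateTimeFormatter.ofPattern("yyyy-M-d")

  // Returns None for lines that do not have seven well-formed fields,
  // for example a row that was cut off mid-record.
  def parse(line: String): Option[CarRecord] = {
    val fields = line.trim.stripPrefix("[").stripSuffix("]").split(",").map(_.trim)
    if (fields.length != 7) None
    else Try(
      CarRecord(
        id = fields(0),
        driverAge = fields(1).toInt,
        totalKiloMeter = fields(2).toInt,
        emissions = fields(3).toInt,
        date = LocalDate.parse(fields(4), dateFormat),
        kiloMeter = fields(5).toInt,
        fuel = fields(6).toInt
      )
    ).toOption
  }
}
```

Returning an `Option` lets malformed or incomplete lines be dropped with `flatMap` rather than failing the whole job.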