Could you try this?
df.groupBy(((col("timeStamp") - start) / bucketLengthSec).cast(IntegerType))
  .agg(max("timeStamp"), max("value")).collect()
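For reference, a self-contained sketch of that bucketing idea against the
DataFrame API (the sample data, start, and bucketLengthSec below are
illustrative assumptions, not from the original mail):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, max}
import org.apache.spark.sql.types.IntegerType

object BucketExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("buckets").master("local[*]").getOrCreate()
    import spark.implicits._

    // Hypothetical time series: (timeStamp in seconds, value).
    val df = Seq((100L, 1.0), (130L, 2.5), (170L, 0.5), (210L, 3.0))
      .toDF("timeStamp", "value")

    val start = 100L           // assumed window origin
    val bucketLengthSec = 60L  // assumed window length in seconds

    // Integer division maps each row to a fixed-length bucket; one
    // aggregate row per bucket then stands in for that window.
    df.groupBy((((col("timeStamp") - start) / bucketLengthSec).cast(IntegerType)).as("bucket"))
      .agg(max("timeStamp"), max("value"))
      .collect()
      .foreach(println)

    spark.stop()
  }
}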
On Wed, Dec 9, 2015 at 8:54 AM, Arun Verma wrote:
> Hi all,
>
> We have an RDD (main) of sorted time-series data. We want to split it into
> different RDDs according to window size and then perform some aggregation ...
Hi Arun,
A Java API was actually recently added to the library. It will be
available in the next release.
-Sandy
On Thu, Dec 10, 2015 at 12:16 AM, Arun Verma wrote:
> Thank you for your reply. It is a Scala and Python library. Does a similar
> library exist for Java?
>
> On Wed, Dec 9, 2015 at 10:26 PM, Sean Owen wrote:
Thank you for your reply. It is a Scala and Python library. Does a similar
library exist for Java?
On Wed, Dec 9, 2015 at 10:26 PM, Sean Owen wrote:
> CC Sandy as his https://github.com/cloudera/spark-timeseries might be
> of use here.
>
> On Wed, Dec 9, 2015 at 4:54 PM, Arun Verma wrote:
> > Hi all,
CC Sandy as his https://github.com/cloudera/spark-timeseries might be
of use here.
On Wed, Dec 9, 2015 at 4:54 PM, Arun Verma wrote:
> Hi all,
>
> We have an RDD (main) of sorted time-series data. We want to split it into
> different RDDs according to window size and then perform some aggregation
> on ...
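Since the original question was framed in terms of RDDs rather than
DataFrames, here is a minimal RDD-level sketch of the same windowed
aggregation (the sample data, start, and windowSec are illustrative
assumptions, not from the original mail):

import org.apache.spark.{SparkConf, SparkContext}

object WindowExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("windows").setMaster("local[*]"))

    // Hypothetical sorted time series, mirroring the "main" RDD from the
    // question: (timestamp in seconds, value).
    val mainRdd = sc.parallelize(Seq((100L, 1.0), (130L, 2.5), (170L, 0.5), (210L, 3.0)))

    val start = 100L     // assumed window origin
    val windowSec = 60L  // assumed window length in seconds

    // Key each observation by its window index, then reduce per window,
    // keeping the max timestamp and max value seen in each window.
    val perWindow = mainRdd
      .map { case (ts, v) => ((ts - start) / windowSec, (ts, v)) }
      .reduceByKey { case ((t1, v1), (t2, v2)) => (math.max(t1, t2), math.max(v1, v2)) }

    perWindow.collect().foreach(println)
    sc.stop()
  }
}

If genuinely separate RDDs per window are required, one could collect the
distinct window keys and filter once per key, but a single keyed reduce like
this is usually far cheaper than materializing many small RDDs.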