Re: [Pyspark 2.3+] Timeseries with Spark

2019-12-29 Thread Masood Krohy
Hi Rishi, Spark and Flint are useful during the data engineering phase, but you'd need to look elsewhere after that. I'm not aware of any active Spark-native project for ML/forecasting on time series data. If the data that you want to train the model on can fit in one node's memory, you can u

Splitting resource in Spark cluster

2019-12-29 Thread Tzahi File
Hi All, I'm using one Spark cluster that contains 50 nodes of type i3.4xl (16 vCores). I'm trying to run 4 Spark SQL queries simultaneously. The data is split into 10 even partitions, and the 4 queries run on the same data, but on different partitions. I have tried to configure the cluster so ea

Re: [Pyspark 2.3+] Timeseries with Spark

2019-12-29 Thread Rishi Shah
Hi All, Checking in to see if anyone has input on time series libraries for Spark. I am mainly interested in financial forecasting models and regression at this point. The input is a bunch of pricing data points. I have read a lot about the spark-timeseries and flint libraries but I am not sure of the b