
RE: PySpark RDD with NumpyArray Structure

2015-12-06 Thread Darren Govoni

Maybe this is helpful: https://github.com/lensacom/sparkit-learn/blob/master/README.rst

Original message
From: Mustafa Elbehery
Date: 12/06/2015 3:59 PM (GMT-05:00)
To: user
Subject: PySpark RDD with NumpyArray

PySpark RDD with NumpyArray Structure

2015-12-06 Thread Mustafa Elbehery

Hi All,

I would like to parallelize a Python NumPy array so that I can apply scikit-learn algorithms on top of Spark. When I call sc.parallelize(), I receive an RDD with a different structure. To be more precise, I am trying to have the following:

X = [[ 0.49426097 1.45106697] [-1.42808099 -0.83706377] [ 0.338559
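
For anyone reading this thread later, here is a minimal sketch of one way to deal with the structure mismatch described above. It assumes that sc.parallelize() treats a 2-D NumPy array as a sequence of rows, so each RDD element is a 1-D array rather than the 2-D matrix scikit-learn expects. The helper names (split_into_blocks, stack_rows, to_block_rdd, rows_to_block_rdd) and the block count are hypothetical and not taken from the original messages; sc is assumed to be an existing SparkContext.

```python
import numpy as np


def split_into_blocks(X, num_blocks):
    """Split a 2-D array into row blocks; every block is still a 2-D ndarray."""
    return np.array_split(X, num_blocks, axis=0)


def stack_rows(rows):
    """Rebuild one 2-D block from the 1-D rows of a single partition."""
    rows = list(rows)
    if rows:  # skip empty partitions
        yield np.vstack(rows)


def to_block_rdd(sc, X, num_blocks):
    """Option A: parallelize pre-split blocks, one 2-D block per RDD element.

    `sc` is assumed to be a live SparkContext.
    """
    return sc.parallelize(split_into_blocks(X, num_blocks), num_blocks)


def rows_to_block_rdd(sc, X, num_blocks):
    """Option B: parallelize the rows, then stack each partition into a block."""
    return sc.parallelize(list(X), num_blocks).mapPartitions(stack_rows)
```

With either option, each element of the resulting RDD is a 2-D array that can be passed to code expecting a matrix, for example inside a map() call. Whether a scikit-learn estimator can then be fitted block by block depends on the algorithm, which is the problem the sparkit-learn project linked in the reply sets out to address.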