RE: (Spark SQL) partition-scoped UDF

2015-09-09 Thread Eron Wright

spark/dl4j-spark-ml/src/main/scala/org/deeplearning4j/spark/ml/classification/MultiLayerNetworkClassification.scala#L143

Thanks Reynold for your time.
-Eron

Date: Sat, 5 Sep 2015 13:55:34 -0700
Subject: Re: (Spark SQL) partition-scoped UDF
From: ewri...@live.com
To: r...@databricks.com
CC

Re: (Spark SQL) partition-scoped UDF

2015-09-05 Thread Eron Wright

of a solution compatible with Spark 1.4 or 1.5? Thanks again!

From: Reynold Xin
Date: Friday, September 4, 2015 at 5:19 PM
To: Eron Wright
Cc: "dev@spark.apache.org"
Subject: Re: (Spark SQL) partition-scoped UDF

Can you say more about your transformer? This is a good idea, and ind

Re: (Spark SQL) partition-scoped UDF

2015-09-04 Thread Reynold Xin

Can you say more about your transformer? This is a good idea, and indeed we are doing it for R already (the latest way to run UDFs in R is to pass the entire partition as a local R dataframe for users to run on). However, what works for R for simple data processing might not work for your high per

(Spark SQL) partition-scoped UDF

2015-09-04 Thread Eron Wright

Transformers in Spark ML typically operate on a per-row basis, based on callUDF. For a new transformer that I'm developing, I have a need to transform an entire partition with a function, as opposed to transforming each row separately. The reason is that, in my case, rows must be transformed i
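
For readers following the thread, the idea of transforming a whole partition rather than each row can be sketched with `mapPartitions` on the DataFrame's underlying RDD, which is available in Spark 1.4 and 1.5. This is only an illustration, not the approach the thread settled on: the object name `PartitionScopedTransform`, the function `scoreBatch`, the batch size, and the assumption that the input column holds non-null doubles are all hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

object PartitionScopedTransform {

  // Hypothetical stand-in for a function that needs many rows at once.
  // Here it simply centers each value on the mean of its batch.
  def scoreBatch(features: Seq[Double]): Seq[Double] = {
    val mean = if (features.isEmpty) 0.0 else features.sum / features.size
    features.map(_ - mean)
  }

  // Appends outputCol by applying scoreBatch to groups of rows within each
  // partition, instead of calling a UDF once per row.
  // Assumption: inputCol is a non-null DoubleType column.
  def transform(df: DataFrame,
                inputCol: String,
                outputCol: String,
                batchSize: Int = 32): DataFrame = {
    val inputIndex = df.columns.indexOf(inputCol)
    require(inputIndex >= 0, s"Column $inputCol not found")

    // Output schema is the input schema plus one double column.
    val outputSchema = StructType(
      df.schema.fields :+ StructField(outputCol, DoubleType, nullable = false))

    val transformed: RDD[Row] = df.rdd.mapPartitions { rows =>
      // rows iterates over one whole partition; grouped() yields batches
      // of at most batchSize rows without loading the partition at once.
      rows.grouped(batchSize).flatMap { batch =>
        val scores = scoreBatch(batch.map(_.getDouble(inputIndex)))
        batch.zip(scores).map { case (row, score) =>
          Row.fromSeq(row.toSeq :+ score)
        }
      }
    }

    df.sqlContext.createDataFrame(transformed, outputSchema)
  }
}
```

In a Spark shell this could be called as `PartitionScopedTransform.transform(df, "feature", "score")`, where both column names are placeholders. Going through `df.rdd` leaves the DataFrame API for the duration of the transformation, so this step is opaque to the query optimizer; that trade-off is part of why a built-in partition-scoped UDF was being asked about.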