Thanks, Peter. Can you share the Tokenizer.java class for Spark 1.2.1?

Dimple
On Tue, Jun 2, 2015 at 10:51 AM, Peter Rudenko <petro.rude...@gmail.com> wrote:

> Hi Dimple,
> take a look at the existing transformers:
>
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/Tokenizer.scala
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
> (*it's for spark-1.4)
>
> The idea is just to implement a class that extends Transformer with
> HasInputCol with HasOutputCol (if your transformer is a 1:1 column
> transformer) and has a
>
> def transform(dataset: DataFrame): DataFrame
>
> method.
>
> Thanks,
> Peter
>
> On 2015-06-02 20:19, dimple wrote:
>
> Hi,
> I would like to embed my own transformer in the Spark.ml Pipeline but do
> not see an example of it. Can someone share an example of which
> classes/interfaces I need to extend/implement in order to do so? Thanks.
>
> Dimple
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Embedding-your-own-transformer-in-Spark-ml-Pipleline-tp23112.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
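
For reference, the pattern Peter describes (a class that carries an input column name and an output column name and exposes a transform method over the dataset) can be sketched in plain Java without Spark on the classpath. This is only a minimal illustration under stated assumptions: ColumnTransformer, SimpleTokenizer, and the list-of-maps "dataset" are hypothetical stand-ins, not Spark classes, and this is not the Tokenizer shipped with any Spark release.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Spark's Transformer. A "dataset" here is just a
// list of rows, each row mapping column name -> value (NOT a Spark DataFrame).
interface ColumnTransformer {
    List<Map<String, Object>> transform(List<Map<String, Object>> dataset);
}

// A 1:1 column transformer: reads one input column, appends one output column.
class SimpleTokenizer implements ColumnTransformer {
    // Plays the role of HasInputCol / HasOutputCol in this sketch.
    private String inputCol = "text";
    private String outputCol = "words";

    SimpleTokenizer setInputCol(String name) { this.inputCol = name; return this; }
    SimpleTokenizer setOutputCol(String name) { this.outputCol = name; return this; }

    @Override
    public List<Map<String, Object>> transform(List<Map<String, Object>> dataset) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> row : dataset) {
            // Copy the row so the input dataset is left unmodified.
            Map<String, Object> out = new LinkedHashMap<>(row);
            String text = (String) row.get(inputCol);
            // Missing or blank text yields an empty token list; otherwise
            // lower-case the text and split it on runs of whitespace.
            List<String> tokens = (text == null || text.trim().isEmpty())
                    ? new ArrayList<String>()
                    : Arrays.asList(text.trim().toLowerCase().split("\\s+"));
            out.put(outputCol, tokens);
            result.add(out);
        }
        return result;
    }
}
```

Calling new SimpleTokenizer().setInputCol("text").setOutputCol("words").transform(rows) returns new rows with the extra "words" column and leaves the input rows untouched. In Spark itself the class would extend Transformer and operate on a DataFrame, as in the quoted message; the exact base classes and method signatures differ between Spark versions, so check the sources for the release you are on.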