I found this: https://spark.apache.org/docs/1.2.0/api/java/org/apache/spark/ml/feature/Tokenizer.html which indicates that Tokenizer did exist in Spark 1.2.0. So it was in 1.2.0 but not in 1.2.1?

On Tue, Jun 2, 2015 at 12:45 PM, Peter Rudenko <petro.rude...@gmail.com> wrote:

> I'm afraid there's no such class for 1.2.1. This API was added in 1.3.0,
> AFAIK.
>
> On 2015-06-02 21:40, Dimp Bhat wrote:
>
> Thanks Peter. Can you share the Tokenizer.java class for Spark 1.2.1?
>
> Dimple
>
> On Tue, Jun 2, 2015 at 10:51 AM, Peter Rudenko <petro.rude...@gmail.com>
> wrote:
>
>> Hi Dimple,
>> take a look at the existing transformers:
>>
>> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
>>
>> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/Tokenizer.scala
>>
>> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
>> (*it's for spark-1.4)
>>
>> The idea is just to implement a class that extends Transformer with
>> HasInputCol with HasOutputCol (if your transformer is a 1:1 column
>> transformer) and has a
>>
>> def transform(dataset: DataFrame): DataFrame
>>
>> method.
>>
>> Thanks,
>> Peter
>>
>> On 2015-06-02 20:19, dimple wrote:
>>
>> Hi,
>> I would like to embed my own transformer in the Spark.ml Pipeline but do
>> not see an example of it. Can someone share an example of which
>> classes/interfaces I need to extend/implement in order to do so? Thanks.
>>
>> Dimple
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Embedding-your-own-transformer-in-Spark-ml-Pipleline-tp23112.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
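
For anyone finding this thread later, the pattern Peter describes could look roughly like the sketch below. It is only an illustration, not code from the Spark sources: it assumes a recent Spark release (2.x or later) where `UnaryTransformer` and `Identifiable.randomUID` are public, and the class name `UpperCaser` is made up. `UnaryTransformer` already mixes in the input/output column params, so a 1:1 column transformer only has to supply the per-value function and the output type. The 1.3/1.4 APIs differ in details, so check the `Transformer` scaladoc for the version you are on.

```scala
import org.apache.spark.ml.UnaryTransformer
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.types.{DataType, StringType}

// Hypothetical 1:1 column transformer: upper-cases a string column.
// Assumes Spark 2.x or later; the name UpperCaser is illustrative only.
class UpperCaser(override val uid: String)
    extends UnaryTransformer[String, String, UpperCaser] {

  // No-arg constructor that generates a unique id for this stage.
  def this() = this(Identifiable.randomUID("upperCaser"))

  // Function applied to every value of the input column.
  // Null values are passed through unchanged.
  override protected def createTransformFunc: String => String =
    s => if (s == null) null else s.toUpperCase

  // Data type of the column that gets appended to the DataFrame.
  override protected def outputDataType: DataType = StringType
}
```

It would then be configured like the built-in stages, for example `new UpperCaser().setInputCol("text").setOutputCol("upper")`, and passed as one of the stages to `Pipeline.setStages`.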