I found this:
https://spark.apache.org/docs/1.2.0/api/java/org/apache/spark/ml/feature/Tokenizer.html
which indicates that Tokenizer did exist in Spark 1.2.0. So it exists in
1.2.0 but not in 1.2.1?

On Tue, Jun 2, 2015 at 12:45 PM, Peter Rudenko <petro.rude...@gmail.com>
wrote:

>  I'm afraid there's no such class for 1.2.1. This API was added to 1.3.0
> AFAIK.
>
>
> On 2015-06-02 21:40, Dimp Bhat wrote:
>
> Thanks, Peter. Can you share the Tokenizer.java class for Spark 1.2.1?
>
>  Dimple
>
> On Tue, Jun 2, 2015 at 10:51 AM, Peter Rudenko <petro.rude...@gmail.com>
> wrote:
>
>>  Hi Dimple,
>> take a look at the existing transformers:
>>
>> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
>>
>> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/Tokenizer.scala
>>
>> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
>> (*it's for spark-1.4)
>>
>> The idea is just to implement a class that extends Transformer with
>> HasInputCol with HasOutputCol (if your transformer is a 1:1 column
>> transformer) and has a
>>
>> def transform(dataset: DataFrame): DataFrame
>>
>> method.
>>
>> Thanks,
>> Peter
>> On 2015-06-02 20:19, dimple wrote:
>>
>> Hi,
>> I would like to embed my own transformer in the Spark.ml Pipeline but do
>> not see an example of it. Can someone share an example of which
>> classes/interfaces I need to extend/implement in order to do so? Thanks.
>>
>> Dimple
>
>
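
For anyone landing on this thread later, the approach Peter describes can be sketched roughly as below. This is only an illustration under some assumptions, not code from the thread: it assumes a later Spark release (2.x or newer), where UnaryTransformer and Identifiable are public developer APIs, and it extends UnaryTransformer (the base class that Tokenizer itself extends, and which already mixes in HasInputCol and HasOutputCol) rather than Transformer directly. The class name UpperCaser and the column names are made up for the example. The method signatures changed between releases, so check them against the version you actually run.

```scala
import org.apache.spark.ml.UnaryTransformer
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.types.{DataType, StringType}

// Hypothetical 1:1 column transformer: upper-cases a string column.
// Assumes Spark 2.x or newer; the 1.x signatures are different.
class UpperCaser(override val uid: String)
    extends UnaryTransformer[String, String, UpperCaser] {

  // No-arg constructor that generates a unique id for this stage.
  def this() = this(Identifiable.randomUID("upperCaser"))

  // The function applied to every value of the input column.
  override protected def createTransformFunc: String => String =
    _.toUpperCase

  // Reject non-string input columns when the schema is checked.
  override protected def validateInputType(inputType: DataType): Unit =
    require(inputType == StringType,
      s"Input type must be StringType but got $inputType.")

  // Type of the column that transform() appends.
  override protected def outputDataType: DataType = StringType
}
```

Usage would then look like new UpperCaser().setInputCol("text").setOutputCol("upper"), either called directly via transform(df) or placed in a Pipeline's stages array alongside the built-in transformers.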
