[ 
https://issues.apache.org/jira/browse/FLINK-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537143#comment-14537143
 ] 

Felix Neutatz commented on FLINK-1999:
--------------------------------------

Hi Ronny,

I think the functionality is quite similar to the [Feature 
Hasher](https://issues.apache.org/jira/browse/FLINK-1735). Your input type will 
also be DataSet[Seq[String]], and your function should go into the same package 
as the Feature Hasher, following the way scikit-learn organizes it: 
http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
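
To make the expected input and output concrete, here is a minimal sketch of a plain tf-idf weighting over pre-tokenized documents (lists of strings, analogous to Seq[String]). It is written in plain Python rather than against the Flink API, and the function name `tf_idf` and the sample documents are made up for illustration. It uses the unsmoothed textbook form tf * log(N / df); scikit-learn's TfidfVectorizer applies smoothing and normalization by default, so its numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf-idf weights for a list of token lists.

    Hypothetical helper for illustration only: returns one
    {term: weight} dict per input document.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # raw term counts within this document
        weights.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return weights


# Made-up sample input: each document is already split into tokens.
docs = [["flink", "is", "fast"], ["flink", "ml"], ["ml", "is", "fun"]]
result = tf_idf(docs)
```

With this weighting, a term that occurs in every document gets weight 0, and rarer terms get larger weights.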

> TF-IDF transformer
> ------------------
>
>                 Key: FLINK-1999
>                 URL: https://issues.apache.org/jira/browse/FLINK-1999
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Ronny Bräunlich
>            Priority: Minor
>
> Hello everybody,
> we are a group of three students from TU Berlin (I guess we're not the first 
> group creating an issue) and we want to/have to implement a tf-idf transformer 
> for Flink.
> Our lecturer Alexander told us that we could get some guidance here and that 
> you could point us to an old version of a similar transformer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
