[ 
https://issues.apache.org/jira/browse/FLINK-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550448#comment-14550448
 ] 

Till Rohrmann commented on FLINK-1999:
--------------------------------------

You will probably have to apply a trick similar to the one used by the 
{{FeatureHasher}}: hash each word to an index, so that a dictionary of unknown 
size maps to a fixed-length vector. Alternatively, you can first compute the 
dictionary of all known words and the corresponding word-to-index mapping in a 
separate pass over the corpus.
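
To make the two options concrete, here is a minimal sketch in plain Python 
(not Flink code, and not how the {{FeatureHasher}} is implemented). The 
function names, the CRC32 hash, the default vector size and the smoothed IDF 
formula are all assumptions chosen for illustration.

```python
import math
import zlib


def hashed_tf(words, num_features=16):
    """Option 1: hash each word into a fixed-length term-frequency vector."""
    vec = [0.0] * num_features
    for word in words:
        # crc32 is stable across runs, unlike Python's salted built-in hash().
        idx = zlib.crc32(word.encode("utf-8")) % num_features
        vec[idx] += 1.0  # colliding words share a slot
    return vec


def build_dictionary(docs):
    """Option 2, step 1: one pass over the corpus to index every known word."""
    vocab = sorted({word for doc in docs for word in doc})
    return {word: i for i, word in enumerate(vocab)}


def dictionary_tf(words, index):
    """Option 2, step 2: term frequencies using the precomputed mapping."""
    vec = [0.0] * len(index)
    for word in words:
        if word in index:  # words outside the dictionary are ignored
            vec[index[word]] += 1.0
    return vec


def idf(tf_vectors):
    """Smoothed inverse document frequency (one common variant, assumed here)."""
    n = len(tf_vectors)
    dims = len(tf_vectors[0])
    df = [sum(1 for v in tf_vectors if v[j] > 0) for j in range(dims)]
    return [math.log((n + 1) / (d + 1)) for d in df]


docs = [["flink", "ml", "flink"], ["tf", "idf"]]
index = build_dictionary(docs)
tf_vectors = [dictionary_tf(doc, index) for doc in docs]
weights = idf(tf_vectors)
tfidf = [[t * w for t, w in zip(vec, weights)] for vec in tf_vectors]
```

The hashed variant needs no extra pass over the data but can merge different 
words into the same slot. The dictionary variant avoids collisions at the cost 
of computing and distributing the mapping first.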

> TF-IDF transformer
> ------------------
>
>                 Key: FLINK-1999
>                 URL: https://issues.apache.org/jira/browse/FLINK-1999
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Ronny Bräunlich
>            Assignee: Alexander Alexandrov
>            Priority: Minor
>              Labels: ML
>
> Hello everybody,
> we are a group of three students from TU Berlin (I guess we're not the first 
> group creating an issue) and we want to/have to implement a tf-idf transformer 
> for Flink.
> Our lecturer Alexander told us that we could get some guidance here and that 
> you could point us to an old version of a similar transformer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
