[ https://issues.apache.org/jira/browse/FLINK-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550448#comment-14550448 ]

Till Rohrmann commented on FLINK-1999:
--------------------------------------

You probably have to apply a trick similar to the one the {{FeatureHasher}} uses to map the words of a dictionary of unknown size to a fixed-length vector. Alternatively, you can first compute the dictionary of all known words and the corresponding mapping.

> TF-IDF transformer
> ------------------
>
>                 Key: FLINK-1999
>                 URL: https://issues.apache.org/jira/browse/FLINK-1999
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Ronny Bräunlich
>            Assignee: Alexander Alexandrov
>            Priority: Minor
>              Labels: ML
>
> Hello everybody,
> we are a group of three students from TU Berlin (I guess we're not the first
> group creating an issue) and we want to/have to implement a tf-idf transformer
> for Flink.
> Our lecturer Alexander told us that we could get some guidance here and that
> you could point us to an old version of a similar transformer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
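
For illustration, the hashing idea from the comment above could look roughly like the sketch below. It is plain Python rather than Flink code and is not the actual {{FeatureHasher}} implementation; the function name hashed_term_frequencies, the num_features parameter, and the choice of MD5 as a stable hash are all assumptions made for this example. It only computes the term-frequency part; the IDF weights would still have to be computed over the whole corpus.

```python
import hashlib


def hashed_term_frequencies(tokens, num_features=16):
    """Map tokens to a fixed-length vector of term counts (hashing trick).

    Hypothetical helper for illustration only, not part of Flink.
    """
    vector = [0] * num_features
    for token in tokens:
        # Use a stable digest: Python's built-in hash() for strings is
        # salted per process, so indices would change between runs.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % num_features
        # Different words may collide in the same bucket; their counts add up.
        vector[index] += 1
    return vector


# Example document; no dictionary of known words is needed up front.
doc = "flink makes stream processing with flink simple".split()
tf = hashed_term_frequencies(doc, num_features=16)
```

The vector length is fixed by num_features regardless of how many distinct words appear, at the cost of possible collisions between words. The alternative mentioned in the comment, building the dictionary of all known words first, avoids collisions but needs an extra pass over the data.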