[ https://issues.apache.org/jira/browse/FLINK-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540137#comment-14540137 ]
Vassil Dimov commented on FLINK-1999:
-------------------------------------

Hello [~aalexandrov],

We have two questions about the type of the transformer that you suggested, ``Point[K, Seq[String]] => Point[K, SparseVector[Double]]``.

As we understand it, K should be a specific word from the document, and the SparseVector should contain the tf-idf values of that word, one for each document. Additionally, to be sure that we have understood this correctly, we would like to ask whether the ``Seq[String]`` is the sequence of documents that we get as input.

To summarize: the input for the job should be the documents and a specific word. The output should be the tf-idf values for each document for that word.

> TF-IDF transformer
> ------------------
>
>                 Key: FLINK-1999
>                 URL: https://issues.apache.org/jira/browse/FLINK-1999
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Ronny Bräunlich
>            Assignee: Alexander Alexandrov
>            Priority: Minor
>              Labels: ML
>
> Hello everybody,
> we are a group of three students from TU Berlin (I guess we're not the first
> group creating an issue) and we want to/have to implement a tf-idf transformer
> for Flink.
> Our lecturer Alexander told us that we could get some guidance here and that
> you could point us to an old version of a similar transformer.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)