[ 
https://issues.apache.org/jira/browse/FLINK-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540137#comment-14540137
 ] 

Vassil Dimov commented on FLINK-1999:
-------------------------------------

Hello [~aalexandrov],

We have two questions about the type of the transformer that you suggested 
``Point[K, Seq[String]] => Point[K, SparseVector[Double]]``.

As we understand it, K is a specific word from the documents, and the 
SparseVector contains the tf-idf value of that word for each document. Is that 
correct?

Also, to be sure that we understood it correctly: is the ``Seq[String]`` a 
sequence of documents that we get as input?

In summary: the input for the job should be the documents and a specific word, 
and the output should be the tf-idf value of that word for each document.
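
To make our reading concrete, here is a minimal sketch in plain Scala (not 
Flink code) of the computation as we understand it. The names ``tokenize``, 
``tf``, ``idf`` and ``tfIdfPerDocument`` are our own, hypothetical helpers, the 
whitespace tokenizer is an assumption, and the raw-count tf with log(N / df) idf 
is only one common variant, picked for illustration.

```scala
object TfIdfSketch {
  // Assumption: a document is a plain string, tokenized on whitespace
  // and lower-cased. A real tokenizer would likely do more.
  def tokenize(doc: String): Seq[String] =
    doc.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq

  // Term frequency: raw count of `word` in one tokenized document.
  def tf(word: String, tokens: Seq[String]): Double =
    tokens.count(_ == word).toDouble

  // Inverse document frequency: log(N / df), where df is the number of
  // documents containing `word`. Returns 0.0 if no document contains it.
  def idf(word: String, docs: Seq[Seq[String]]): Double = {
    val df = docs.count(_.contains(word))
    if (df == 0) 0.0 else math.log(docs.size.toDouble / df)
  }

  // One tf-idf value per document for a single word, in document order.
  def tfIdfPerDocument(word: String, docs: Seq[String]): Seq[Double] = {
    val tokenized = docs.map(tokenize)
    val weight = idf(word, tokenized)
    tokenized.map(tokens => tf(word, tokens) * weight)
  }
}
```

For example, ``TfIdfSketch.tfIdfPerDocument("flink", docs)`` would return one 
value per document in ``docs``, with 0.0 for every document that does not 
contain the word.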

> TF-IDF transformer
> ------------------
>
>                 Key: FLINK-1999
>                 URL: https://issues.apache.org/jira/browse/FLINK-1999
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Ronny Bräunlich
>            Assignee: Alexander Alexandrov
>            Priority: Minor
>              Labels: ML
>
> Hello everybody,
> we are a group of three students from TU Berlin (I guess we're not the first 
> group creating an issue) and we want to/have to implement a tf-idf transformer 
> for Flink.
> Our lecturer Alexander told us that we could get some guidance here and that 
> you could point us to an old version of a similar transformer.