Re: does HashingTF maintain a inverse index?

Yanbo Liang Fri, 01 Jan 2016 07:54:06 -0800

Hi Andy,

Spark ML/MLlib does not currently provide a transformer to map
HashingTF-generated features back to words.
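
Since no such transformer exists, the reverse index has to be kept on the
side. Below is a minimal sketch of that idea in plain Python, not Spark code.
The names index_of, build_inverse_index and NUM_FEATURES are hypothetical, and
zlib.crc32 is only a stand-in hash: it will not reproduce the indices that
HashingTF assigns. To line up with real HashingTF output, index_of would have
to be replaced with the same hash the library applies (the RDD-based
mllib.feature.HashingTF exposes indexOf(term) for this purpose).

```python
import zlib
from collections import defaultdict

NUM_FEATURES = 1 << 10  # hypothetical vector size, chosen for illustration


def index_of(term, num_features=NUM_FEATURES):
    """Map a term to a bucket index.

    Stand-in hash only: this is NOT the hash function Spark uses,
    so the indices will differ from HashingTF's output.
    """
    return zlib.crc32(term.encode("utf-8")) % num_features


def build_inverse_index(documents, num_features=NUM_FEATURES):
    """Return {bucket index: set of terms that hash to that bucket}."""
    inverse = defaultdict(set)
    for tokens in documents:
        for term in tokens:
            inverse[index_of(term, num_features)].add(term)
    return inverse


# Tokenized documents (made-up sample data).
docs = [["spark", "hashing", "tf"], ["spark", "classifier"]]
inverse = build_inverse_index(docs)

# Buckets holding more than one term are collisions: the feature at that
# index cannot be attributed to a single word.
collisions = {i: terms for i, terms in inverse.items() if len(terms) > 1}
```

Each bucket maps to a set rather than a single word, so a collision shows up
as a set with several terms instead of silently overwriting an entry.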


2016-01-01 8:37 GMT+08:00 Hayri Volkan Agun <volkana...@gmail.com>:

> Hi,
>
> If you are using the pipeline API, you do not need to map features back to
> documents.
> Your input (the document text) won't change after you use
> HashingTF.
> If you want to do information retrieval with Spark, I suggest using
> RDDs rather than the pipeline...
>
> On Fri, Jan 1, 2016 at 2:20 AM, Andy Davidson <
> a...@santacruzintegration.com> wrote:
>
>> Hi
>>
>> I am working on a proof of concept. I am trying to use Spark to classify
>> some documents. I am using Tokenizer and HashingTF to convert the documents
>> into vectors. Is there an easy way to map features back to words, or do I
>> need to maintain the reverse index myself? I realize there is a chance
>> that some words map to the same bucket.
>>
>> Kind regards
>>
>> Andy
>>
>>
>
>
> --
> Hayri Volkan Agun
> PhD. Student - Anadolu University
>
