Hi Andy, Spark ML/MLlib does not currently provide a transformer to map HashingTF-generated features back to words.
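
Until something like that exists, keeping your own reverse index is probably the simplest route. Below is a rough sketch of the idea in plain Python, not Spark code. The names bucket, build_reverse_index and NUM_FEATURES are made up for illustration, and the CRC32 hash is only a stand-in: it will not produce the same indices as HashingTF. Each bucket maps to a set of words because, as you noted, several words can land in the same bucket.

```python
import zlib
from collections import defaultdict

# Hypothetical feature-space size, chosen only for this illustration.
NUM_FEATURES = 1 << 10


def bucket(term, num_features=NUM_FEATURES):
    """Map a term to a bucket index.

    ASSUMPTION: CRC32 is a stand-in hash. It is NOT the hash function
    used by Spark's HashingTF, so the indices will differ from Spark's.
    """
    return zlib.crc32(term.encode("utf-8")) % num_features


def build_reverse_index(docs, num_features=NUM_FEATURES):
    """Build a mapping: bucket index -> set of terms hashed into it."""
    index = defaultdict(set)
    for tokens in docs:
        for term in tokens:
            index[bucket(term, num_features)].add(term)
    return index


# Two already-tokenized example documents.
docs = [
    ["spark", "classifies", "documents"],
    ["hashing", "maps", "words", "to", "buckets"],
]

reverse_index = build_reverse_index(docs)

# Look up which known words could have produced a given feature index.
print(sorted(reverse_index[bucket("spark")]))
```

To line this up with real feature vectors you would have to swap the stand-in hash for the index function HashingTF itself uses. If I remember correctly, the RDD-based mllib HashingTF exposes indexOf(term) for this, but please check that against the Spark version you are on.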

2016-01-01 8:37 GMT+08:00 Hayri Volkan Agun <volkana...@gmail.com>:

> Hi,
>
> If you are using pipeline api, you do not need to map features back to
> documents.
> Your input (which is the document text) won't change after you used
> HashingTF.
> If you want to do Information Retrieval with spark, I suggest you to use
> not the pipeline but RDDs...
>
> On Fri, Jan 1, 2016 at 2:20 AM, Andy Davidson <
> a...@santacruzintegration.com> wrote:
>
>> Hi
>>
>> I am working on proof of concept. I am trying to use spark to classify
>> some documents. I am using tokenizer and hashingTF to convert the documents
>> into vectors. Is there any easy way to map feature back to words or do I
>> need to maintain the reverse index my self? I realize there is a chance
>> some words map to same buck
>>
>> Kind regards
>>
>> Andy
>
> --
> Hayri Volkan Agun
> PhD. Student - Anadolu University