Re: vector model usage

2010-06-08 Thread Dionisis Koumouras
Hi, this example is really close to what I'm trying to do, but unfortunately it uses a lot of classes that are outdated in version 2.9.2, which I'm currently using. Actually, it takes text in which boosts follow terms (delimited by a special char), parses the text and then adds documents to the index.

Re: vector model usage

2010-06-01 Thread Rebecca Watson
Hi, if you want to store word+value pairs and then use Lucene scoring to weight the words with higher values against them, you should look at using payloads and the DelimitedPayloadTokenFilter which lets you specify e.g. word1|value1 word2|value2 ... and the values are stored as payloads against the w
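
To make the word1|value1 word2|value2 input format concrete, here is a minimal sketch in plain Java of splitting such text into term/weight pairs. It does not use Lucene at all and is not how DelimitedPayloadTokenFilter is implemented; the class name WeightedTerms, the method parse, and the choice to sum the weights of repeated terms are assumptions made only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: parses "term|weight" tokens
// separated by whitespace into an ordered term -> weight map.
final class WeightedTerms {

    static Map<String, Float> parse(String text, char delimiter) {
        Map<String, Float> weights = new LinkedHashMap<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return weights; // no tokens at all
        }
        for (String token : trimmed.split("\\s+")) {
            // Use the last delimiter so the term itself may contain the character.
            int cut = token.lastIndexOf(delimiter);
            if (cut <= 0 || cut == token.length() - 1) {
                throw new IllegalArgumentException(
                        "Expected term" + delimiter + "weight, got: " + token);
            }
            String term = token.substring(0, cut);
            float weight = Float.parseFloat(token.substring(cut + 1));
            // Assumption: repeated terms accumulate their weights.
            weights.merge(term, weight, Float::sum);
        }
        return weights;
    }
}
```

Calling WeightedTerms.parse("lucene|2.5 vector|1.0", '|') gives a map with one weight per term, which could then be handed to whatever indexing code attaches the weights to the terms.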

Re: vector model usage

2010-06-01 Thread Dionisis Koumouras
Thanks for your reply Grant. I checked out the TokenStream class and you are right but I'm afraid I didn't really make myself understood. What I want is to be able to create a Document out of key-value pairs of terms and float numbers representing word weights, insert the Document in the index and

Re: vector model usage

2010-06-01 Thread Grant Ingersoll
On May 31, 2010, at 6:25 AM, Dionisis Koumouras wrote: > Hi all, > I'm new to Lucene but have used it successfully for a few simple tasks. > > I am experimenting with the vector space representation of documents and > have managed to store and retrieve TermFreqVector objects. > > The question is

vector model usage

2010-05-31 Thread Dionisis Koumouras
Hi all, I'm new to Lucene but have used it successfully for a few simple tasks. I am experimenting with the vector space representation of documents and have managed to store and retrieve TermFreqVector objects. The question is whether it is possible to directly add vector space representations of