Re: vector model usage

2010-06-08 Thread Dionisis Koumouras
Hi, this example is really close to what I'm trying to do, but unfortunately it uses a lot of classes that are outdated in version 2.9.2, which I'm currently using. Actually, it uses text in which boosts follow terms (delimited by a special character), parses the text, and then adds documents to the index.

Re: vector model usage

2010-06-01 Thread Rebecca Watson
Hi, if you want to store word+value pairs and then use Lucene scoring to weight the words with higher values against them, you should look at using payloads and the DelimitedPayloadTokenFilter, which lets you specify e.g. word1|value1 word2|value2 ... and the values are stored as payloads against the w
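
For illustration only, the word1|value1 word2|value2 input format mentioned above can be sketched in plain Java. This is not Lucene's implementation and it does not use the Lucene API: the class name DelimitedWeights, the method parse, and the default weight of 1.0 for terms without a delimiter are all assumptions made for this sketch. In Lucene itself, DelimitedPayloadTokenFilter does the splitting inside the analysis chain and stores the value as a payload on each token.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Lucene: it only illustrates the
// "term|weight" text format by turning it into a term -> weight map.
class DelimitedWeights {

    // Assumption for this sketch: a term without a delimiter gets weight 1.0.
    static final float DEFAULT_WEIGHT = 1.0f;

    static Map<String, Float> parse(String text, char delimiter) {
        // LinkedHashMap keeps the terms in the order they appear in the text.
        Map<String, Float> weights = new LinkedHashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields a single empty token
            }
            int idx = token.lastIndexOf(delimiter);
            if (idx < 0) {
                weights.put(token, DEFAULT_WEIGHT);
                continue;
            }
            if (idx == 0 || idx == token.length() - 1) {
                throw new IllegalArgumentException("Malformed token: " + token);
            }
            String term = token.substring(0, idx);
            // Float.parseFloat throws NumberFormatException on a non-numeric weight.
            float weight = Float.parseFloat(token.substring(idx + 1));
            weights.put(term, weight);
        }
        return weights;
    }

    public static void main(String[] args) {
        // Prints the parsed term -> weight pairs for a small sample input.
        System.out.println(parse("lucene|2.5 index|0.5 plain", '|'));
    }
}
```

A map like this is only a stand-in for inspecting the input; to have the weights influence scoring, they would still need to be indexed as payloads and read back at query time.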

Re: vector model usage

2010-06-01 Thread Dionisis Koumouras
Thanks for your reply, Grant. I checked out the TokenStream class and you are right, but I'm afraid I didn't really make myself understood. What I want is to be able to create a Document out of key-value pairs of terms and float numbers representing word weights, insert the Document in the index and

Re: vector model usage

2010-06-01 Thread Grant Ingersoll
On May 31, 2010, at 6:25 AM, Dionisis Koumouras wrote:
> Hi all,
> I'm new to Lucene but have used it successfully for a few simple tasks.
>
> I am experimenting with the vector space representation of documents and
> have managed to store and retrieve TermFreqVector objects.
>
> The question is