Getting values with low scores

2009-04-26 Thread samd
I have 2500 documents and need to have the matches with the very lowest rank returned. How can I get this? It is very important. When I look at the index in Luke I see the fields with my values, but they all have low rank. When I search, they don't show in the results.

Re: Using Payloads

2009-04-26 Thread Murat Yakici
See my comments: > Yes, for this specific part, I have this prior knowledge which is based on a training set. > About the things you raise here, there are two things you might mean, I am not sure: > 1. If you don't have that "prior" knowledge, then all it means you need to modify the f

Re: lsi as indexing algorithm with lucene

2009-04-26 Thread Dominik Jednoralski
Hi, I'm the guy who wrote the bachelor thesis on this. Sorry it took a while to publish it to the community, but I had to improve it before publishing. The topic of the thesis was to augment the Lucene-driven search facility of the Intelligent Tutoring System ActiveMath by latent semantics. Semanti

Re: Using Payloads

2009-04-26 Thread liat oren
Yes, for this specific part, I have this prior knowledge which is based on a training set. About the things you raise here, there are two things you might mean, I am not sure: 1. If you don't have that "prior" knowledge, then all it means you need to modify the formula of the score, no? to give mo

Re: Using Payloads

2009-04-26 Thread Murat Yakici
Yes, this is more or less what I had in mind. However, this approach requires some *prior knowledge* of the vocabulary of the document (or the collection) to produce that score before it even gets analyzed, doesn't it? And this is the paradox that I have been thinking about. If you have that knowl

Re: Using Payloads

2009-04-26 Thread liat oren
Thanks, Murat. It was very useful - I also tried to override IndexWriter and DocumentsWriter instead, but it didn't work well. DocumentsWriter can't be overridden. So, I didn't find a better way to make the changes. What I need is a different value for every term in each document. So, l