Indexing with weights

2011-01-24 Thread Chris Schilling
Hello,

I have a bunch of text documents formatted like so:

keyword1 wt1
keyword2 wt2
keyword3 wt3

I would like to index the documents based on the keywords. When I retrieve (search) for a keyword, I would like the list of documents to be sorted by the weight for that keyword. Is there an ex

Re: Indexing with weights

2011-01-24 Thread Chris Schilling
> Note: the field you sort on should NOT be tokenized.
>
> Best
> Erick
>
> On Mon, Jan 24, 2011 at 4:02 PM, Chris Schilling wrote:
>
>> Hello,
>>
>> I have a bunch of text documents formatted like so:
>>
>> keyword1 wt1
>> keyw

Re: Indexing with weights

2011-01-24 Thread Chris Schilling
Then just search on keywords and sort on weight.
>
> Note: the field you sort on should NOT be tokenized.
>
> Best
> Erick
>
> On Mon, Jan 24, 2011 at 4:02 PM, Chris Schilling wrote:
>
>> Hello,
>>
>> I have a bunch of text documents formatted like so:
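
To make the "search on keywords, sort on weight" idea concrete, here is a minimal sketch in plain Python. It is not Lucene code and does not use any Lucene API; the function names (`parse_keyword_weights`, `build_index`, `search`) and the choice of descending order are assumptions made for illustration. It only shows the shape of the data: each keyword maps to the documents that contain it, together with that keyword's weight in each document.

```python
from collections import defaultdict


def parse_keyword_weights(text):
    """Parse lines of the form 'keyword weight' into a {keyword: weight} dict."""
    weights = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        keyword, weight = parts
        weights[keyword] = float(weight)
    return weights


def build_index(docs):
    """Build keyword -> [(doc_id, weight), ...] from {doc_id: document text}."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for keyword, weight in parse_keyword_weights(text).items():
            index[keyword].append((doc_id, weight))
    return index


def search(index, keyword):
    """Return (doc_id, weight) pairs for a keyword, highest weight first (assumed order)."""
    return sorted(index.get(keyword, []), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "a.txt": "foo 0.2\nbar 0.9",
        "b.txt": "foo 0.7",
    }
    index = build_index(docs)
    print(search(index, "foo"))
```

Note that the weight here belongs to a (keyword, document) pair rather than to the document as a whole, which is why the sketch stores it alongside each posting.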

new to lucene, non standard index

2011-05-05 Thread Chris Schilling
Hi,

I am trying to figure out how to solve this problem:

I have about 500,000 files that I would like to index, but the files are structured. So, each file has the following layout:

doc1
token1, weight11, frequency1, weight21
token2, weight12, frequency2, weight22
.
.
.
etc for 500,000 docs.
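
As an illustration of the layout described above, the following plain-Python sketch parses one such file and ranks the matches for a token by a chosen field. It assumes the first non-empty line is the document id and that every following line holds exactly four comma-separated fields; the names `Posting`, `parse_doc`, and `rank_by_field` are hypothetical, and nothing here is Lucene API.

```python
from typing import NamedTuple


class Posting(NamedTuple):
    """One (document, token) record with its three numeric properties."""
    doc_id: str
    token: str
    weight1: float
    frequency: int
    weight2: float


def parse_doc(text):
    """Parse a file whose first line is the doc id, followed by
    'token, weight1, frequency, weight2' rows (assumed layout)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    doc_id, rows = lines[0], lines[1:]
    postings = []
    for row in rows:
        token, w1, freq, w2 = [field.strip() for field in row.split(",")]
        postings.append(Posting(doc_id, token, float(w1), int(freq), float(w2)))
    return postings


def rank_by_field(postings, token, sort_by="weight1"):
    """Return postings for a token, largest value of `sort_by` first."""
    hits = [p for p in postings if p.token == token]
    return sorted(hits, key=lambda p: getattr(p, sort_by), reverse=True)


if __name__ == "__main__":
    doc1 = "doc1\nfoo, 0.5, 3, 0.1\nbar, 0.2, 1, 0.9\n"
    doc2 = "doc2\nfoo, 0.8, 1, 0.4\n"
    postings = parse_doc(doc1) + parse_doc(doc2)
    for hit in rank_by_field(postings, "foo", sort_by="frequency"):
        print(hit.doc_id, hit.frequency)
```

If a token can occur more than once in the same document, this sketch returns one hit per occurrence; how to collapse those into a single result per document is a separate decision.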

Re: new to lucene, non standard index

2011-05-05 Thread Chris Schilling
ht2 or frequency.
>
> If the token matches are unique within a document you will only get each
> document listed once. If they aren't unique, it's not clear what you want to
> sort by anyway
>
> -Mike
>
> On 05/05/2011 04:12 PM, Chris Schilling wrote:

Re: new to lucene, non standard index

2011-05-05 Thread Chris Schilling
en matches are unique within a document you will only get each
> document listed once. If they aren't unique, it's not clear what you want to
> sort by anyway
>
> -Mike
>
> On 05/05/2011 04:12 PM, Chris Schilling wrote:
>> Hi,
>>
>> I am try

Re: new to lucene, non standard index

2011-05-05 Thread Chris Schilling
> but I think you're saying that doesn't happen
>
> On 05/05/2011 06:09 PM, Chris Schilling wrote:
>> Hey Mike,
>>
>> Let me clarify:
>>
>> The tokens are not unique. Let's say doc1 contains the token
>> foo and has the propertie