RE: setPositionIncrement questions

2008-05-16 Thread Chris Hostetter
: I ended up hacking StandardTokenizer::next() to check for $^$^$, and if it
: is there then set the current Token PositionIncrement to 500 and resume the

from what i remember of your use case, it probably would have been a lot easier to just add each paragraph as a separate field instance (and

RE: setPositionIncrement questions

2008-05-11 Thread Itamar Syn-Hershko
nt: Sunday, March 30, 2008 8:56 AM
To: Lucene Users
Subject: Re: setPositionIncrement questions

: Breaking proximity data has been discussed several times before, and
: concluded that setPositionIncrement is the way to go. In regards of it:
:
: 1. Where should it be called exactly to create the ga

Re: setPositionIncrement questions

2008-04-01 Thread Erick Erickson
> ne will get to x for both "b" and "c", meaning this could save me query inflation, or as I first suggested, auto-apply synonyms. The only question is, I guess, are there any drawbacks for using this?
>
> Thanks.
>
> Itamar.
>
> -Orig

RE: setPositionIncrement questions

2008-03-31 Thread Chris Hostetter
: duplicated them to give the words they contain more weight. So I will not
: want to return higher PositionIncrement for each instance of a field, just
: those which I'm interested in (title/headers). Can this be done somehow
: without injecting a "magic string", as Chris called it?

there are mu

RE: setPositionIncrement questions

2008-03-31 Thread Itamar Syn-Hershko
me query inflation, or as I first suggested, auto-apply synonyms. The only question is, I guess, are there any drawbacks for using this?

Thanks.

Itamar.

-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: Monday, March 31, 2008 4:25 PM
To: java-user@lucene.apache.org

Re: setPositionIncrement questions

2008-03-31 Thread Erick Erickson
of this by getting a copy of Luke and examining test indexes you build. To boost exact matches, you have to do some fancy dancing. For instance, you could store the original word with a special token (say $) at the end, and *also* the stemmed version at the same position. Then you have to mangle y
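The "original word plus marker, stemmed form at the same position" idea in the snippet above can be pictured with a small, Lucene-free sketch. Everything here is an assumption made for illustration: the class `StackedTokenSketch`, the `Tok` type, and the toy `stem` method are hypothetical and are not Lucene classes. The only thing taken from the thread is the idea that a position increment of 0 places a token at the same position as the previous one.

```java
import java.util.ArrayList;
import java.util.List;

public class StackedTokenSketch {

    /** Hypothetical token: a term plus its position increment (not a Lucene class). */
    public static final class Tok {
        public final String term;
        public final int positionIncrement;

        public Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Toy stemmer (assumption for the sketch): strips a single trailing "s". */
    public static String stem(String word) {
        return word.length() > 1 && word.endsWith("s")
                ? word.substring(0, word.length() - 1)
                : word;
    }

    /**
     * Emits two tokens per word: the original word with a "$" marker
     * (increment 1, so it advances the position) and the stemmed form
     * (increment 0, so it shares the position of the marked original).
     */
    public static List<Tok> analyze(String text) {
        List<Tok> out = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // skip the empty string produced by blank input
            }
            out.add(new Tok(word + "$", 1));
            out.add(new Tok(stem(word), 0));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Tok t : analyze("cats run")) {
            System.out.println(t.term + " +" + t.positionIncrement);
        }
    }
}
```

Only the indexing side is sketched; the query-side handling that the message goes on to describe is cut off in the snippet and is not reconstructed here.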

RE: setPositionIncrement questions

2008-03-31 Thread Itamar Syn-Hershko
ginal Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Sunday, March 30, 2008 8:56 AM
To: Lucene Users
Subject: Re: setPositionIncrement questions

: Breaking proximity data has been discussed several times before, and
: concluded that setPositionIncrement is the way to go. In

Re: setPositionIncrement questions

2008-03-29 Thread Chris Hostetter
: Breaking proximity data has been discussed several times before, and
: concluded that setPositionIncrement is the way to go. In regards of it:
:
: 1. Where should it be called exactly to create the gap properly?

any part of your Analyzer can set the position increment on any token to indicat
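As a rough illustration of why a large position increment breaks proximity matching, the following Lucene-free sketch turns per-token increments into absolute positions. The class and method names are hypothetical, the starting offset of -1 is a modelling assumption chosen so that a first token with increment 1 lands at position 0, and `within` is a simplified stand-in for a slop check, not Lucene's phrase matching. The gap of 500 is the value mentioned elsewhere in this thread.

```java
import java.util.Arrays;

public class PositionGapSketch {

    /** Converts per-token position increments into absolute positions. */
    public static int[] absolutePositions(int[] increments) {
        int[] positions = new int[increments.length];
        int current = -1; // assumption: first token with increment 1 gets position 0
        for (int i = 0; i < increments.length; i++) {
            current += increments[i];
            positions[i] = current;
        }
        return positions;
    }

    /** Simplified proximity check: are two positions at most maxDistance apart? */
    public static boolean within(int posA, int posB, int maxDistance) {
        return Math.abs(posA - posB) <= maxDistance;
    }

    public static void main(String[] args) {
        // Three tokens of one paragraph, then three tokens of the next.
        // The first token of the second paragraph carries an increment of 500.
        int[] increments = {1, 1, 1, 500, 1, 1};
        int[] positions = absolutePositions(increments);

        System.out.println(Arrays.toString(positions)); // [0, 1, 2, 502, 503, 504]
        // Tokens on either side of the gap are too far apart for a distance of 10.
        System.out.println(within(positions[2], positions[3], 10)); // false
    }
}
```

In a real analyzer the increment would be set on the token itself (the thread's subject, `setPositionIncrement`); the sketch only shows the arithmetic that makes the gap effective.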