Re: how to implement a proximity search feature using Queries instead of terms

2011-01-16 Thread Livia Hauser
Hi Robert, it looks really good! Many thanks! Regards, Livia -Original Message- From: "Robert Muir" Sent: 16.01.2011 18:50:36 To: java-user@lucene.apache.org Subject: Re: how to implement a proximity search feature using Queries instead of terms >On Sun, J

Re: Unicode normalisation *before* tokenisation?

2011-01-16 Thread Trejkaz
On Mon, Jan 17, 2011 at 11:53 AM, Robert Muir wrote: > On Sun, Jan 16, 2011 at 7:37 PM, Trejkaz wrote: >> So I guess I have two questions: >>    1. Is there some way to do filtering to the text before >> tokenisation without upsetting the offsets reported by the tokeniser? >>    2. Is there some

Re: Unicode normalisation *before* tokenisation?

2011-01-16 Thread Robert Muir
On Sun, Jan 16, 2011 at 7:37 PM, Trejkaz wrote: > So I guess I have two questions: >    1. Is there some way to do filtering to the text before > tokenisation without upsetting the offsets reported by the tokeniser? >    2. Is there some more general solution to this problem, such as an > existing

Unicode normalisation *before* tokenisation?

2011-01-16 Thread Trejkaz
Hi all. I discovered there is a normalise filter now, using ICU's Normalizer2 (org.apache.lucene.analysis.icu.ICUNormalizer2Filter). However, as this is a filter, various problems can result if used with StandardTokenizer. One in particular is half-width Katakana. Supposing you start out with t
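To make the half-width Katakana concern concrete, here is a minimal sketch. It uses the JDK's java.text.Normalizer instead of the ICU filter named above, purely to keep the example free of dependencies (that swap is an assumption, and the class and helper names are hypothetical). It shows only that NFKC folds a two-character half-width sequence into one full-width character. It does not reproduce StandardTokenizer's behaviour.

```java
import java.text.Normalizer;

final class HalfWidthKatakanaDemo {
    // Hypothetical helper: NFKC-normalise text with the JDK normaliser
    // (stand-in for ICU's Normalizer2, which the post refers to).
    static String nfkc(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // U+FF8A HALFWIDTH KATAKANA LETTER HA followed by
        // U+FF9F HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK.
        String original = "\uFF8A\uFF9F";
        String normalised = nfkc(original);

        // The two half-width code units compose into the single
        // full-width character U+30D1 (KATAKANA LETTER PA).
        System.out.println(original.length() + " -> " + normalised.length()); // prints "2 -> 1"
        System.out.println(normalised.equals("\u30D1"));                      // prints "true"
    }
}
```

Because the normalised string is shorter than the input, character positions computed on the normalised text no longer map one-to-one onto the original, which is the offset problem the questions in this thread are about.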

Re: how to implement a proximity search feature using Queries instead of terms

2011-01-16 Thread Robert Muir
On Sun, Jan 16, 2011 at 12:42 PM, Livia Hauser wrote: > Hi All, > > I have my own query parser which generates fuzzy/wildcard query instances. > It works fantastically, Lucene rocks ;-). > But I have to make sure the words are not too far apart. I checked the current > proximity implementation. What I

how to implement a proximity search feature using Queries instead of terms

2011-01-16 Thread Livia Hauser
Hi All, I have my own query parser which generates fuzzy/wildcard query instances. It works fantastically, Lucene rocks ;-). But I have to make sure the words are not too far apart. I checked the current proximity implementation. What I found is: PhraseQuery calculates a distance between terms (n

Re: Result ordering

2011-01-16 Thread Umesh Prasad
Hi Pelit, My comments are inline. On Sun, Jan 16, 2011 at 8:03 PM, Pelit Mamani wrote: > Hi, > > I'm maintaining some Lucene-based code, and we're trying to get control > over result ordering (users aren't happy with the default). > I know how to boost a Field or Document (very useful). > But:

Re: Result ordering

2011-01-16 Thread Anshum
Hi Pelit, Firstly, the number of words that match a query in a document is not the term frequency. You can learn more about the terminology used in search at http://www.miislita.com/term-vector/term-vector-3.html Looking at what you're trying to achieve, here are a few possible solutions. Below, I am

Question on writer optimize() / file merging?

2011-01-16 Thread sol myr
Hi, I'm trying to understand the behavior of file merging / optimization. I see that whenever my IndexWriter calls 'commit()', it creates a new file (or files). I also see these files merged when calling 'optimize()', as much as allowed by the parameter 'NoCFSRatio'. But I'm still trying to f

Result ordering

2011-01-16 Thread Pelit Mamani
Hi, I'm maintaining some Lucene-based code, and we're trying to get control over result ordering (users aren't happy with the default). I know how to boost a Field or Document (very useful). But: 1) Is there a way to boost "OR" queries, based on the number of matched terms? So the OR quer

Re: Newbie: "Life span" of IndexWriter / IndexSearcher?

2011-01-16 Thread sol myr
Worked like a charm - thanks a lot. --- On Sun, 1/16/11, Raf wrote: From: Raf Subject: Re: Newbie: "Life span" of IndexWriter / IndexSearcher? To: java-user@lucene.apache.org Date: Sunday, January 16, 2011, 3:16 AM Look at the JavaDoc: http://lucene.apache.org/java/3_0_2/api/core/org/apache/l

Re: Newbie: "Life span" of IndexWriter / IndexSearcher?

2011-01-16 Thread Raf
Look at the JavaDoc: http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/index/IndexReader.html#reopen() The *reopen* method returns a *new reader* if the index has changed since the original reader was opened. So, you should do something like this: IndexReader newReader = reader.reope

RE: Newbie: "Life span" of IndexWriter / IndexSearcher?

2011-01-16 Thread sol myr
Hi, Thank you kindly for replying. Unfortunately, reopen() doesn't help me see the changes. Here's my test: First I write & commit a document, and run a search - which correctly finds this document. Then I write & commit another document, re-open the reader and run another search - this should