Re: How to safely close Index objects?

2009-11-11 Thread Michael McCandless
On Wed, Nov 11, 2009 at 12:13 AM, Jacob Rhoden  wrote:

> Given a class with two static variables, is the following safe? ie If I call
> "close" while something else is using the objects, do the objects simply
> hold a flag saying they need to be destroyed once the objects are finished
> being used, or do they not track if anything is currently using the object
> and simply blow up if I try to close them?

You shouldn't close your IndexReader/IndexSearcher until all searches
using them have finished; otherwise you'll hit exceptions.

Also, if you don't separately use the IndexReader, you can just
open/close the IndexSearcher.

However, it's better to use IndexReader's reopen method, which only
opens the new segments created since the last time the reader was
opened.

> How does one get the early edition of the lucene book? I couldn't work out
> how from the website. I am assuming I am missing something obvious (:

You can get the book here http://www.manning.com/hatcher3 (NOTE: I'm
one of the authors!).

Chapter 11 of the book has a class called SearcherManager that
handles the details of reopening/closing the IndexReader while queries
are still in flight; it might be useful here.
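
For readers without the book, here is a minimal sketch of the reference-counting
idea behind such a manager. It is not the book's SearcherManager and it does not
use the Lucene API: FakeSearcher, RefCountedSearcher, SimpleSearcherManager and
swap() are hypothetical names, with FakeSearcher standing in for an
IndexSearcher. It assumes every search calls acquire() before searching and
release() afterwards (ideally in a finally block).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an IndexSearcher; it only records whether it was closed.
class FakeSearcher {
    volatile boolean closed = false;
    void close() { closed = true; }
}

// Holder that closes the searcher only when the last reference is released.
class RefCountedSearcher {
    final FakeSearcher searcher = new FakeSearcher();
    // Starts at 1: that reference belongs to the manager itself.
    private final AtomicInteger refs = new AtomicInteger(1);

    // Takes a reference unless the count already dropped to zero (searcher closed).
    boolean tryAcquire() {
        while (true) {
            int n = refs.get();
            if (n == 0) return false;
            if (refs.compareAndSet(n, n + 1)) return true;
        }
    }

    void release() {
        if (refs.decrementAndGet() == 0) searcher.close();
    }
}

class SimpleSearcherManager {
    private volatile RefCountedSearcher current = new RefCountedSearcher();

    // Called by each search before it starts.
    RefCountedSearcher acquire() {
        while (true) {
            RefCountedSearcher s = current;
            if (s.tryAcquire()) return s;
            // Lost a race with swap(); loop and pick up the new current searcher.
        }
    }

    // Called by each search when it is done.
    void release(RefCountedSearcher s) { s.release(); }

    // Installs a fresh searcher (for example after a reopen) and drops the
    // manager's reference to the old one; in-flight searches keep it alive.
    synchronized void swap(RefCountedSearcher fresh) {
        RefCountedSearcher old = current;
        current = fresh;
        old.release();
    }
}

class SearcherManagerDemo {
    public static void main(String[] args) {
        SimpleSearcherManager manager = new SimpleSearcherManager();
        RefCountedSearcher inFlight = manager.acquire();   // a search starts
        manager.swap(new RefCountedSearcher());            // index was reopened
        System.out.println("closed while in flight: " + inFlight.searcher.closed);
        manager.release(inFlight);                         // the search finishes
        System.out.println("closed after release: " + inFlight.searcher.closed);
    }
}
```

The point of the design is that close() is never called directly by the code
that swaps searchers; it happens as a side effect of the last release().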

Mike

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: IndexWriter.close() no longer seems to close everything

2009-11-11 Thread Albert Juhe

I don't know if it's the same problem, but I think it's similar.

My problem is with the IndexSearcher. I've installed a web search engine
that uses Lucene, and after each search I perform a close operation like this:

private IndexSearcher searcher;

NIOFSDirectory directory = new NIOFSDirectory(new File(path));
this.searcher = new IndexSearcher(directory, true);

public void close() {
    try {
        System.out.println("Closing: "
                + this.searcher.getIndexReader().directory().toString());
        this.searcher.getIndexReader().directory().close();
        this.searcher.getIndexReader().close();
        this.searcher.close();
    } catch (IOException e) {
        System.out.println(" caught a " + e.getClass()
                + "\n with message: " + e.getMessage());
    }
}

JBoss doesn't close the *.cfs index files; every time I run a new search,
JBoss holds another file open.
If I run 5 searches, JBoss holds 5 *.cfs files open (I can see this with the
lsof command), and the only way to release them is to stop JBoss.

The problem is that after many searches, JBoss goes down and I have to
restart it.

Do you have the same problem with IndexSearcher, or only with IndexWriter?

Albert Juhe
Learning Technologies 
Universitat Oberta de Catalunya


Michael McCandless-2 wrote:
> 
> Does this look like a real leak John?  You're definitely closing every
> reader you get back from getReader?
> 
> Mike
> 
> On Sun, Nov 8, 2009 at 10:41 PM, John Wang  wrote:
>> I am seeing the same thing, but only when IndexWriter.getReader is called
>> at a high rate.
>>
>> from lsof, I see file handles growing.
>>
>> -John
>>
>> On Sun, Nov 8, 2009 at 7:29 PM, Daniel Noll  wrote:
>>
>>> Hi all.
>>>
>>> We updated to Lucene 2.9, and now we find that after closing our text
>>> index, it is not possible to rename the directory in which it resides
>>> (we are actually renaming a directory further up the hierarchy.)
>>>
>>> We discovered that the following files were still open by the process:
>>>
>>>  _0.tis, _0.frq, _0.prx, _0.fdt, _0.fdx, _0.tvx, _0.tvd, _0.tvf
>>>
>>> We are calling IndexWriter.close() shortly before attempting to write
>>> to the directory (a few lines of code earlier) so I suspect it could
>>> be related to timing somehow if Lucene is perhaps still doing
>>> something on a background thread at this time (though I was under the
>>> impression that close() waited for merges and so forth to complete
>>> before returning.)
>>>
>>> Daniel
>>>
>>> --
>>> Daniel Noll                            Forensic and eDiscovery Software
>>> Senior Developer                              The world's most advanced
>>> Nuix                                                email data analysis
>>> http://nuix.com/                                and eDiscovery software

-- 
View this message in context: 
http://old.nabble.com/IndexWriter.close%28%29-no-longer-seems-to-close-everything-tp26260801p26298910.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.





Re: IndexWriter.close() no longer seems to close everything

2009-11-11 Thread Michael McCandless
Do you see your exception handler printing anything out?

You don't need to close the underlying IndexReader, just the
IndexSearcher (which will close the IndexReader, since it was the one
that had opened it).
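
The ownership rule above ("whoever opened it closes it") can be sketched in
plain Java. SketchReader and SketchSearcher are hypothetical stand-ins, not the
Lucene classes; the sketch only assumes that a searcher which opened its own
reader closes it, while a searcher wrapped around a caller-supplied reader
leaves that reader to the caller.

```java
// Hypothetical stand-in for an index reader; it only records whether it was closed.
class SketchReader {
    boolean closed = false;
    void close() { closed = true; }
}

// Hypothetical stand-in for a searcher that tracks whether it owns its reader.
class SketchSearcher {
    private final SketchReader reader;
    private final boolean ownsReader;

    // The searcher opens the reader itself, so it is responsible for closing it.
    SketchSearcher() {
        this.reader = new SketchReader();
        this.ownsReader = true;
    }

    // The caller opened the reader, so the caller stays responsible for it.
    SketchSearcher(SketchReader external) {
        this.reader = external;
        this.ownsReader = false;
    }

    SketchReader getReader() { return reader; }

    // Closes the reader only if this searcher opened it.
    void close() {
        if (ownsReader) reader.close();
    }
}

class OwnershipDemo {
    public static void main(String[] args) {
        SketchSearcher owning = new SketchSearcher();
        owning.close();
        System.out.println("owned reader closed: " + owning.getReader().closed);

        SketchReader external = new SketchReader();
        SketchSearcher wrapping = new SketchSearcher(external);
        wrapping.close();
        System.out.println("external reader closed: " + external.closed);
        external.close(); // the caller closes what the caller opened
    }
}
```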

Mike




Re: IndexWriter.close() no longer seems to close everything

2009-11-11 Thread Albert Juhe

I don't get any exception.

thank you Mike






Re: IndexWriter.close() no longer seems to close everything

2009-11-11 Thread Michael McCandless
Can you narrow the leak down to a small self-contained test?

Mike


Re: IndexWriter.close() no longer seems to close everything

2009-11-11 Thread Albert Juhe

I think that this is the best way to proceed.

thank you Mike




Equality Numeric Query

2009-11-11 Thread Shai Erera
Hi

I index documents with numeric fields using the new Numeric package. I
execute two types of queries: range queries (for example, [1 TO 20}) and
equality queries (for example 24.75). Don't mind the syntax.

Currently, to execute the equality query, I create a NumericRangeQuery with
the lower/upper value being 24.75 and both limits are set to inclusive. Two
questions:
1) Is there a better approach? For example, if I had indexed the values as
separate terms, I could create a TermQuery.
2) Can I run into precision issues such that 24.751 will be matched as well?

Shai


Re: Equality Numeric Query

2009-11-11 Thread Yonik Seeley
On Wed, Nov 11, 2009 at 8:54 AM, Shai Erera  wrote:
> I index documents with numeric fields using the new Numeric package. I
> execute two types of queries: range queries (for example, [1 TO 20}) and
> equality queries (for example 24.75). Don't mind the syntax.
>
> Currently, to execute the equality query, I create a NumericRangeQuery with
> the lower/upper value being 24.75 and both limits are set to inclusive. Two
> questions:
> 1) Is there a better approach? For example, if I had indexed the values as
> separate terms, I could create a TermQuery.

Create a term query on NumericUtils.floatToPrefixCoded(24.75f)

> 2) Can I run into precision issues such that 24.751 will be matched as well?

Nope... every numeric indexed value has its precision indexed along
with it as a prefix, so there will be no false matches.
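
To illustrate why the precision prefix rules out false matches, here is a rough
sketch in plain Java. It does not reproduce Lucene's actual term layout: the
"shift:value" string format, the class name PrefixCodingSketch and the helper
names are assumptions made for illustration, and a step of 4 bits is assumed.
The idea is that each value is indexed once per precision level, and the level
(the shift) is part of the term, so a lower-precision term can never be
confused with a full-precision one.

```java
class PrefixCodingSketch {
    // Maps a float to an int whose signed order matches the float order.
    static int sortableBits(float v) {
        int bits = Float.floatToRawIntBits(v);
        return bits < 0 ? bits ^ 0x7fffffff : bits;
    }

    // Hypothetical term layout: the shift (precision level) is a prefix of the term.
    static String term(float v, int shift) {
        return shift + ":" + Integer.toHexString(sortableBits(v) >>> shift);
    }

    public static void main(String[] args) {
        // Print the terms both values would get at each precision level.
        for (int shift = 0; shift < 32; shift += 4) {
            System.out.println(term(24.75f, shift) + "  vs  " + term(24.751f, shift));
        }
    }
}
```

At shift 0 (full precision) the two values produce different terms, so an
equality query on the full-precision term for 24.75 cannot match 24.751. At
coarser levels such as shift 16 they share a term, which is what range queries
exploit, but those terms carry a different prefix.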

-Yonik
http://www.lucidimagination.com




RE: Equality Numeric Query

2009-11-11 Thread Uwe Schindler
Hi Shai,

In 2.9.1, the approach using identical, inclusive upper/lower bounds is the
officially supported usage. The Query is optimized to rewrite efficiently in
this case (constant score term query).

But you can also use a TermQuery like Yonik suggested and convert the
numbers yourself.

You will never hit any false terms, as the encoding clearly differentiates
between precisions.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de




Re: Equality Numeric Query

2009-11-11 Thread Shai Erera
Thanks !

If I use Yonik's approach, do I need to index the terms in a special way?

Shai



RE: Equality Numeric Query

2009-11-11 Thread Uwe Schindler
No.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Equality Numeric Query

2009-11-11 Thread Shai Erera
Thanks a lot for the super fast response !

Shai



RE: Equality Numeric Query

2009-11-11 Thread Uwe Schindler
Thanks!

I would still suggest using a NumericRangeQuery with identical, inclusive
upper and lower bounds, because it does not require expert APIs (NumericUtils
is one) and is more comfortable to use. As long as you do not change the
rewrite method, there is no speed difference, as it rewrites to a simple
constant score term query (rewrites to a single-term term enum -> creates a
BooleanQuery with one TermQuery -> this rewrites to the TermQuery -> wraps it
in ConstantScore).
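
As an illustration of the precision question raised earlier in this thread,
here is a minimal plain-Java sketch of how a double can be mapped to a long
that sorts the same way. It is an illustration under assumptions, not Lucene's
code: the class and method names are made up, and it models only the
full-precision value, not the lower-precision trie terms (which, as far as I
understand, carry their own precision marker). The point is simply that two
distinct doubles such as 24.75 and 24.751 map to distinct values, so an
equality query on one should not match the other.

```java
// Hypothetical sketch: shows the idea of a sortable long encoding for doubles.
// The names below are illustrative and are not part of the Lucene API.
class SortableDoubleSketch {

    // Map a double to a long whose signed ordering matches the double ordering.
    static long toSortableLong(double value) {
        long bits = Double.doubleToLongBits(value);
        // Negative doubles have the sign bit set. Flipping the other 63 bits
        // reverses their order so that "more negative" sorts lower.
        if (bits < 0) {
            bits ^= 0x7fffffffffffffffL;
        }
        return bits;
    }

    public static void main(String[] args) {
        long a = toSortableLong(24.75);
        long b = toSortableLong(24.751);
        System.out.println("same encoding: " + (a == b));
        System.out.println("order kept:    " + (a < b));
    }
}
```

Because the mapping is one-to-one on ordinary double values, equality on the
encoded form is the same as equality on the original double.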

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Shai Erera [mailto:ser...@gmail.com]
> Sent: Wednesday, November 11, 2009 3:26 PM
> To: java-user@lucene.apache.org
> Subject: Re: Equality Numeric Query
> 
> Thanks a lot for the super fast response !
> 
> Shai
> 
> On Wed, Nov 11, 2009 at 4:21 PM, Uwe Schindler  wrote:
> 
> > No.
> >
> > -
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> > > -Original Message-
> > > From: Shai Erera [mailto:ser...@gmail.com]
> > > Sent: Wednesday, November 11, 2009 3:17 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Re: Equality Numeric Query
> > >
> > > Thanks !
> > >
> > > If I use Yonik's approach, do I need to index the terms in a special
> way?
> > >
> > > Shai
> > >
> > > On Wed, Nov 11, 2009 at 4:13 PM, Uwe Schindler 
> wrote:
> > >
> > > > Hi Shai,
> > > >
> > > > In 2.9.1, the approach using upper/lower bound identical and
> included
> > is
> > > > the
> > > > official supported usage. The Query is optimized to rewrite
> efficient
> > in
> > > > this case (constant score term query).
> > > >
> > > > But you can also use a TermQuery like Yonik suggested and converting
> > the
> > > > numbers yourself.
> > > >
> > > > You will never hit any false terms, as the encoding clearly
> > > differentiate
> > > > between precisions.
> > > >
> > > > -
> > > > Uwe Schindler
> > > > H.-H.-Meier-Allee 63, D-28213 Bremen
> > > > http://www.thetaphi.de
> > > > eMail: u...@thetaphi.de
> > > >
> > > > > -Original Message-
> > > > > From: Shai Erera [mailto:ser...@gmail.com]
> > > > > Sent: Wednesday, November 11, 2009 2:55 PM
> > > > > To: java-user@lucene.apache.org
> > > > > Subject: Equality Numeric Query
> > > > >
> > > > > Hi
> > > > >
> > > > > I index documents with numeric fields using the new Numeric
> package.
> > I
> > > > > execute two types of queries: range queries (for example, [1 TO
> 20})
> > > and
> > > > > equality queries (for example 24.75). Don't mind the syntax.
> > > > >
> > > > > Currently, to execute the equality query, I create a
> > NumericRangeQuery
> > > > > with
> > > > > the lower/upper value being 24.75 and both limits are set to
> > > inclusive.
> > > > > Two
> > > > > questions:
> > > > > 1) Is there a better approach? For example, if I had indexed the
> > > values
> > > > as
> > > > > separate terms, I could create a TermQuery.
> > > > > 2) Can I run into precision issues such that 24.751 will be
> matched
> > as
> > > > > well?
> > > > >
> > > > > Shai
> > > >
> > > >
> > > > 
> -
> > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > > For additional commands, e-mail: java-user-h...@lucene.apache.org
> > > >
> > > >
> >
> >
> > -
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >





Wrapping IndexSearcher so that it is safe?

2009-11-11 Thread Jacob Rhoden
I am pondering a way to allow closing an index searcher and releasing the
pointer to it, so that it cleans itself up automatically once all threads
stop using it. Inspired by the Objective-C retain/release model, what do
you think about this?

Basically, when a thread starts a search it increments the "retain" count;
when it leaves the searcher it decrements the count, and closes the searcher
if a close has been requested. I am attracted to this solution as it seems
to simplify things greatly, unless I have overlooked something.



public class SafeIndexSearcher {

    private boolean finish = false;
    private int retainCount = 0;
    private IndexSearcher searcher;

    public SafeIndexSearcher(IndexSearcher searcher) {
        this.searcher = searcher;
    }

    public TopDocs search(Query query, int limit) throws IOException {
        this.retain();

        try {
            TopDocs result = searcher.search(query, limit);
            this.release();
            return result;
        } catch (IOException e) {
            this.release();
            throw e;
        }
    }

    public synchronized void close() {
        finish = true;
    }

    private synchronized void retain() throws IOException {
        if (finish)
            throw new IOException("SafeIndexSearcher used after close has been called.");

        retainCount++;
    }

    private synchronized void release() {
        retainCount--;
        if (finish && retainCount == 0)
            try {
                searcher.close();
            } catch (IOException e) {
                System.err.println("IndexSearcher.close() unexpected error: " + e.getMessage());
            }
    }

}

Thanks!
Jacob


Information Technology Services,
The University of Melbourne

Email: jrho...@unimelb.edu.au
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575



RE: Wrapping IndexSearcher so that it is safe?

2009-11-11 Thread Uwe Schindler
Looks good. About your code: the searcher will never be closed if search()
throws an unchecked exception, because release() is then skipped. Code like
this should always use a finally block. So simply do not catch and rethrow
the IOException; instead, put release() in a finally block and let the
IOException propagate upwards.


>   this.retain();
>   try {
>   TopDocs result = searcher.search(query, limit);
>   return result;
>   } finally {
>   this.release();
>   }

Less code more secure and effective :-)
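
For readers who want to try the pattern without Lucene on the classpath, here
is a minimal sketch of the same retain/release idea with the finally block
applied. Everything in it is an assumption made for illustration: it wraps any
java.io.Closeable instead of an IndexSearcher, and the names
RefCountedResource, Action and use() are hypothetical. It also closes the
resource straight away when close() is called while nothing is in flight,
a case the posted class does not cover.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in for the SafeIndexSearcher idea. It wraps any Closeable
// so the sketch runs without Lucene; none of these names come from Lucene.
class RefCountedResource<T extends Closeable> {

    // Work to run against the wrapped resource (for example, a search).
    interface Action<S, R> {
        R apply(S resource) throws IOException;
    }

    private final T resource;
    private int retainCount = 0;
    private boolean closeRequested = false;
    private boolean closed = false;

    RefCountedResource(T resource) {
        this.resource = resource;
    }

    // Retain, run the action, and always release, even on unchecked exceptions.
    <R> R use(Action<T, R> action) throws IOException {
        retain();
        try {
            return action.apply(resource);
        } finally {
            release();
        }
    }

    // Request a close; it happens now if idle, otherwise on the last release.
    synchronized void close() {
        closeRequested = true;
        closeIfIdle();
    }

    synchronized boolean isClosed() {
        return closed;
    }

    private synchronized void retain() throws IOException {
        if (closeRequested) {
            throw new IOException("resource used after close() was called");
        }
        retainCount++;
    }

    private synchronized void release() {
        retainCount--;
        closeIfIdle();
    }

    // Callers must hold the lock on this object.
    private void closeIfIdle() {
        if (closeRequested && retainCount == 0 && !closed) {
            closed = true;
            try {
                resource.close();
            } catch (IOException e) {
                System.err.println("close() unexpected error: " + e.getMessage());
            }
        }
    }
}
```

A caller would write something like holder.use(s -> s.search(query, limit)),
with the lambda standing in for whatever needs the searcher.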

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Jacob Rhoden [mailto:jrho...@unimelb.edu.au]
> Sent: Wednesday, November 11, 2009 11:12 PM
> To: java-user@lucene.apache.org
> Subject: Wrapping IndexSearcher so that it is safe?
> 
> I am pondering a way to allow closing of an index searcher and
> releasing the pointer to it so that it automatically cleans up by
> itself when all threads stop using the index searcher. Inspired by the
> Objective C retain/release model, what do you think about this?
> 
> Basically when threads start a search, they increment the "retain"
> count, when threads leave the searcher, they decrement the "retain"
> count, and close the searcher if requested. I am attracted to this
> solution as it seems to simplify things greatly unless I have
> overlooked something.
> 
> 
> public class SafeIndexSearcher {
> 
>   private boolean finish = false;
>   private int retainCount = 0;
>   private IndexSearcher searcher;
> 
>   public SafeIndexSearcher(IndexSearcher searcher) {
>   this.searcher = searcher;
>   }
> 
>   public TopDocs search(Query query, int limit) throws IOException {
>   this.retain();
> 
>   try {
>   TopDocs result = searcher.search(query, limit);
>   this.release();
>   return result;
>   } catch (IOException e) {
>   this.release();
>   throw e;
>   }
> 
>   }
> 
>   public synchronized void close() {
>   finish = true;
>   }
> 
>   private synchronized void retain() throws IOException {
>   if(finish)
>   throw new IOException("SafeIndexSearcher used after close has been called.");
>   retainCount++;
>   }
> 
>   private synchronized void release() {
>   retainCount--;
>   if(finish && retainCount==0)
>   try {
>   searcher.close();
>   } catch (IOException e) {
>   System.err.println("IndexSearcher.close() unexpected error: " + e.getMessage());
>   }
>   }
> 
> }
> 
> Thanks!
> Jacob
> 
> 
> Information Technology Services,
> The University of Melbourne
> 
> Email: jrho...@unimelb.edu.au
> Phone: +61 3 8344 2884
> Mobile: +61 4 1095 7575






Re: Wrapping IndexSearcher so that it is safe?

2009-11-11 Thread Jacob Rhoden

I knew I would have overlooked something, thanks for the help!

On 12/11/2009, at 9:21 AM, Uwe Schindler wrote:

> simply do not catch and rethrow the IOException, instead put release in a
> finally block and let the IOException automatically go upwards.
>
>   this.retain();
>   try {
>       TopDocs result = searcher.search(query, limit);
>       return result;
>   } finally {
>       this.release();
>   }
>
> Less code more secure and effective :-)




Information Technology Services,
The University of Melbourne

Email: jrho...@unimelb.edu.au
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575





Edit distance and wildcard searching with PhraseQuery

2009-11-11 Thread Jeff Plater
Hi,

I am trying to figure out a way that I can query a Lucene index for a
phrase but have some fuzziness (edit distance and/or wildcard) applied
to the individual terms.  An example should help explain what I am
trying to do:

Index contains:

Philadelphia PA

Search is done on:

Philadelphid PA

I want it to result in a hit - basically something like
"Philadelphid~0.75 PA" (that syntax is not valid but explains what I am
looking for).  Similarly, I would like to be able to do something like
"Phil* PA" and get a hit as well.

Does anyone know how I can accomplish this?  Right now I am having to
hit a look up table to translate the city before searching against the
main index - not a fan of this option.

Thanks.

-Jeff Plater



Re: Edit distance and wildcard searching with PhraseQuery

2009-11-11 Thread AHMET ARSLAN
What you are looking for is ComplexPhraseQueryParser [1], which was
implemented in Lucene 2.9.0 (see [2]). It uses the SpanQuery family.
It supports "Phil* PA"~10 as well as "Philadelphid~0.75 PA":
ranges, OR, fuzzy and wildcard terms inside proximity (phrase) queries.


[1] 
http://lucene.apache.org/java/2_9_0/api/contrib-misc/org/apache/lucene/queryParser/complexPhrase/package-summary.html

[2] https://issues.apache.org/jira/browse/LUCENE-1486

 
> I am trying to figure out a way that I can query a Lucene
> index for a
> phrase but have some fuzziness (edit distance and/or
> wildcard) applied
> to the individual terms.  An example should help
> explain what I am
> trying to do:
> 
>  
> 
> Index contains:
> 
> Philadelphia PA
> 
>  
> 
> Search is done on:
> 
> Philadelphid PA
> 
>  
> 
> I want it to result in a hit - basically something like
> "Philadelphid~0.75 PA" (that syntax is not valid but
> explains what I am
> looking for).  Similarly, I would like to be able to
> do something like
> "Phil* PA" and get a hit as well.









Re: Wrapping IndexSearcher so that it is safe?

2009-11-11 Thread Erick Erickson
If you want to spend a few bucks, here's part of a reply to a similar
question
from Mike McCandless a day or so ago

<<<
You can get the book here http://www.manning.com/hatcher3 (NOTE: I'm
one of the authors!).

Chapter 11 in the book has a class called SearcherManager, that
handles the details of reopen/closing the IndexReader while queries
are still in flight, that might be useful here.
>>>

The book is Lucene in Action, Second Edition. Manning has an "early access
program" (MEAP) that lets you get a PDF version. As I remember it, that class
is considerably more extensive and handles the edge cases.

Best
Erick


On Wed, Nov 11, 2009 at 5:41 PM, Jacob Rhoden wrote:

> I knew I would have overlooked something, thanks for the help!
>
> On 12/11/2009, at 9:21 AM, Uwe Schindler wrote:
>
>> simply do not catch and rethrow the IOException, instead put release
>> in a finally block and let the IOException automatically go upwards.
>>
>>>   this.retain();
>>>   try {
>>>       TopDocs result = searcher.search(query, limit);
>>>       return result;
>>>   } finally {
>>>       this.release();
>>>   }
>>
>> Less code more secure and effective :-)
>>
>>
>
>
> 
> Information Technology Services,
> The University of Melbourne
>
> Email: jrho...@unimelb.edu.au
> Phone: +61 3 8344 2884
> Mobile: +61 4 1095 7575
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


RE: Edit distance and wildcard searching with PhraseQuery

2009-11-11 Thread Jeff Plater
Thanks - I tried it out and it seems to work for "Philadelphid~0.75 PA" but I
can't get it working for "Phil* PA" yet.  Perhaps it is an issue with my
Analyzer (I am using WhitespaceAnalyzer)?  Have you used it with wildcards
before?

-Jeff

-Original Message-
From: AHMET ARSLAN [mailto:iori...@yahoo.com] 
Sent: Wednesday, November 11, 2009 5:55 PM
To: java-user@lucene.apache.org
Subject: Re: Edit distance and wildcard searching with PhraseQuery

What you are looking for is ComplexPhraseQueryParser [1] and implemented in 
Lucene 2.9.0. It uses SpanQuery family. 
It supports "Phil* PA"~10 as well as "Philadelphid~0.75 PA".
Ranges, OR, fuzzy and wildcard inside proximity (phrases).


[1] 
http://lucene.apache.org/java/2_9_0/api/contrib-misc/org/apache/lucene/queryParser/complexPhrase/package-summary.html

[2] https://issues.apache.org/jira/browse/LUCENE-1486

 
> I am trying to figure out a way that I can query a Lucene
> index for a
> phrase but have some fuzziness (edit distance and/or
> wildcard) applied
> to the individual terms.  An example should help
> explain what I am
> trying to do:
> 
>  
> 
> Index contains:
> 
> Philadelphia PA
> 
>  
> 
> Search is done on:
> 
> Philadelphid PA
> 
>  
> 
> I want it to result in a hit - basically something like
> "Philadelphid~0.75 PA" (that syntax is not valid but
> explains what I am
> looking for).  Similarly, I would like to be able to
> do something like
> "Phil* PA" and get a hit as well.




  




Re: Edit distance and wildcard searching with PhraseQuery

2009-11-11 Thread Erick Erickson
I'd at least use something that lowercases the input rather than just
WhitespaceAnalyzer. Remember to use it at index time and query time. Between
your queries and typing things in e-mails, case is often a gotcha.

At least carefully check that your casing is identical.
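
To make the casing gotcha concrete, here is a small plain-Java simulation. It
does not use Lucene at all, and it rests on two assumptions that are not
confirmed in this thread: that terms indexed with a whitespace-only analyzer
keep their original case, and that the query parser lowercases wildcard terms
before matching (as far as I know this is the default behaviour of
QueryParser's "lowercase expanded terms" setting). The class and method names
are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical simulation of a case mismatch between indexed terms and a
// wildcard query. No Lucene classes are involved.
class WildcardCaseSketch {

    // Whitespace-only tokenization: terms keep their original case.
    static List<String> whitespaceTerms(String text) {
        List<String> terms = new ArrayList<String>();
        for (String term : text.trim().split("\\s+")) {
            terms.add(term);
        }
        return terms;
    }

    // Tokenization that also lowercases, as a lowercasing analyzer would.
    static List<String> lowercasedTerms(String text) {
        return whitespaceTerms(text.toLowerCase(Locale.ROOT));
    }

    // Prefix match where the query side is lowercased first (the assumption).
    static boolean prefixMatches(List<String> indexedTerms, String prefix) {
        String lowered = prefix.toLowerCase(Locale.ROOT);
        for (String term : indexedTerms) {
            if (term.startsWith(lowered)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "Philadelphia PA";
        System.out.println("case kept:  " + prefixMatches(whitespaceTerms(doc), "Phil"));
        System.out.println("lowercased: " + prefixMatches(lowercasedTerms(doc), "Phil"));
    }
}
```

Under those assumptions the prefix only matches when the indexed terms were
lowercased too, which is why using the same lowercasing analyzer on both
sides matters.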

Best
Erick

On Wed, Nov 11, 2009 at 6:41 PM, Jeff Plater <
jpla...@healthmarketscience.com> wrote:

> Thanks - I tried it out and it seems to work for "Philadelphid~0.75 PA" but
> I can't get it working for "Phil* PA" yet.  Perhaps it is an issue with my
> Analyzer (I am using WhitespaceAnalyzer)?.  Have you used it with wildcard
> before?
>
> -Jeff
>
> -Original Message-
> From: AHMET ARSLAN [mailto:iori...@yahoo.com]
> Sent: Wednesday, November 11, 2009 5:55 PM
> To: java-user@lucene.apache.org
> Subject: Re: Edit distance and wildcard searching with PhraseQuery
>
> What you are looking for is ComplexPhraseQueryParser [1] and implemented in
> Lucene 2.9.0. It uses SpanQuery family.
> It supports "Phil* PA"~10 as well as "Philadelphid~0.75 PA".
> Ranges, OR, fuzzy and wildcard inside proximity (phrases).
>
>
> [1]
> http://lucene.apache.org/java/2_9_0/api/contrib-misc/org/apache/lucene/queryParser/complexPhrase/package-summary.html
>
> [2] https://issues.apache.org/jira/browse/LUCENE-1486
>
>
> > I am trying to figure out a way that I can query a Lucene
> > index for a
> > phrase but have some fuzziness (edit distance and/or
> > wildcard) applied
> > to the individual terms.  An example should help
> > explain what I am
> > trying to do:
> >
> >
> >
> > Index contains:
> >
> > Philadelphia PA
> >
> >
> >
> > Search is done on:
> >
> > Philadelphid PA
> >
> >
> >
> > I want it to result in a hit - basically something like
> > "Philadelphid~0.75 PA" (that syntax is not valid but
> > explains what I am
> > looking for).  Similarly, I would like to be able to
> > do something like
> > "Phil* PA" and get a hit as well.
>
>
>
>
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


Re: Wrapping IndexSearcher so that it is safe?

2009-11-11 Thread Jacob Rhoden

The source code for SearcherManager is even downloadable for free:
   http://www.manning.com/hatcher3/LIAsourcecode.zip

The example source code does some things that are beyond my level of
understanding of Lucene, i.e.:
1) To me it looks like an IndexSearcher never gets closed.
2) I don't understand what happens if the IndexReader is reopened while a
   thread is in the middle of a search using an IndexSearcher.

So I am going for something a bit simpler:

If a thread wants to use the "SafeIndexSearcher", it first calls retain() and
then calls release() when it's done.

If a thread wants to close the "SafeIndexSearcher", the close is deferred
until all threads have called release():



public class SafeIndexSearcher {

    private boolean finish = false;
    private int retainCount = 0;
    private IndexSearcher searcher;

    public SafeIndexSearcher(IndexSearcher searcher) {
        this.searcher = searcher;
    }

    public TopDocs search(Query query, int limit) throws IOException {
        TopDocs result = searcher.search(query, limit);
        return result;
    }

    public Document doc(int doc) throws CorruptIndexException, IOException {
        return searcher.doc(doc);
    }

    public synchronized void close() {
        finish = true;
    }

    public synchronized SafeIndexSearcher retain() throws IOException {
        if (finish)
            throw new IOException("SafeIndexSearcher used after close has been called.");

        retainCount++;
        return this;
    }

    public synchronized SafeIndexSearcher release() {
        retainCount--;
        if (finish && retainCount == 0)
            try {
                searcher.close();
            } catch (IOException e) {
                System.err.println("IndexSearcher.close() unexpected error: " + e.getMessage());
            }
        return this;
    }

}
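
On question 2 above, a rough sketch of the general idea may help. This is not
the book's SearcherManager and uses no Lucene classes; SwappingHolder, Ref,
acquire(), release() and swap() are hypothetical names, and the resource is
any java.io.Closeable. The assumption behind it is that reopening yields a new
instance while the old one stays usable until it is closed, so a search that
is already in flight keeps the old instance and only new searches see the
replacement.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical sketch of swapping a shared resource while it is in use.
// Each generation has its own reference count and is closed only when the
// holder and every in-flight user have let go of it.
class SwappingHolder<T extends Closeable> {

    // One generation of the resource plus its reference count.
    static final class Ref<T extends Closeable> {
        final T resource;
        private int count = 1;          // the holder itself owns one reference
        private boolean closed = false;

        Ref(T resource) {
            this.resource = resource;
        }

        synchronized boolean isClosed() {
            return closed;
        }

        private synchronized void increment() {
            count++;
        }

        private synchronized void decrement() throws IOException {
            if (--count == 0) {
                closed = true;
                resource.close();
            }
        }
    }

    private Ref<T> current;

    SwappingHolder(T initial) {
        current = new Ref<T>(initial);
    }

    // Hand out the current generation. The holder's own reference keeps the
    // count above zero here, so the generation cannot be closed underneath us.
    synchronized Ref<T> acquire() {
        current.increment();
        return current;
    }

    // Must be called exactly once per acquire(), ideally in a finally block.
    void release(Ref<T> ref) throws IOException {
        ref.decrement();
    }

    // Install a replacement; the old generation closes once it is unused.
    void swap(T replacement) throws IOException {
        Ref<T> old;
        synchronized (this) {
            old = current;
            current = new Ref<T>(replacement);
        }
        old.decrement(); // drop the holder's reference to the old generation
    }
}
```

A searching thread would call acquire(), use ref.resource inside a try block,
and call release(ref) in the finally block, much like the retain()/release()
pair in the class above.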

On 12/11/2009, at 9:53 AM, Erick Erickson wrote:


> If you want to spend a few bucks, here's part of a reply to a similar
> question from Mike McCandless a day or so ago
>
> <<<
> You can get the book here http://www.manning.com/hatcher3 (NOTE: I'm
> one of the authors!).
>
> Chapter 11 in the book has a class called SearcherManager, that
> handles the details of reopen/closing the IndexReader while queries
> are still in flight, that might be useful here.
> >>>
>
> The book is Lucene In Action II. Manning has an "early access program"
> (MEAP) that lets you get a PDF version. That class is considerably more
> extensive and handles the edge cases as I remember it
>
> Best
> Erick
>
> On Wed, Nov 11, 2009 at 5:41 PM, Jacob Rhoden wrote:
>
>> I knew I would have overlooked something, thanks for the help!
>>
>> On 12/11/2009, at 9:21 AM, Uwe Schindler wrote:
>>
>>> simply do not catch and rethrow the IOException, instead put release
>>> in a finally block and let the IOException automatically go upwards.
>>>
>>>   this.retain();
>>>   try {
>>>       TopDocs result = searcher.search(query, limit);
>>>       return result;
>>>   } finally {
>>>       this.release();
>>>   }
>>>
>>> Less code more secure and effective :-)
>>
>> Information Technology Services,
>> The University of Melbourne
>>
>> Email: jrho...@unimelb.edu.au
>> Phone: +61 3 8344 2884
>> Mobile: +61 4 1095 7575







Information Technology Services,
The University of Melbourne

Email: jrho...@unimelb.edu.au
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575



RE: Edit distance and wildcard searching with PhraseQuery

2009-11-11 Thread Jeff Plater
Thanks for the suggestion - I double checked the case and it was OK.
Turned out I needed to use the StandardAnalyzer instead of the
WhitespaceAnalyzer.

-Jeff

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, November 11, 2009 6:52 PM
To: java-user@lucene.apache.org
Subject: Re: Edit distance and wildcard searching with PhraseQuery

I'd at least use something that lowercases the input rather than just
WhitespaceAnalyzer. Remember to use it at index time and query time. Between
your queries and typing things in e-mails, case is often a gotcha.

At least carefully check that your casing is identical.

Best
Erick

On Wed, Nov 11, 2009 at 6:41 PM, Jeff Plater <
jpla...@healthmarketscience.com> wrote:

> Thanks - I tried it out and it seems to work for "Philadelphid~0.75 PA" but
> I can't get it working for "Phil* PA" yet.  Perhaps it is an issue with my
> Analyzer (I am using WhitespaceAnalyzer)?.  Have you used it with wildcard
> before?
>
> -Jeff
>
> -Original Message-
> From: AHMET ARSLAN [mailto:iori...@yahoo.com]
> Sent: Wednesday, November 11, 2009 5:55 PM
> To: java-user@lucene.apache.org
> Subject: Re: Edit distance and wildcard searching with PhraseQuery
>
> What you are looking for is ComplexPhraseQueryParser [1] and implemented in
> Lucene 2.9.0. It uses SpanQuery family.
> It supports "Phil* PA"~10 as well as "Philadelphid~0.75 PA".
> Ranges, OR, fuzzy and wildcard inside proximity (phrases).
>
>
> [1]
> http://lucene.apache.org/java/2_9_0/api/contrib-misc/org/apache/lucene/queryParser/complexPhrase/package-summary.html
>
> [2] https://issues.apache.org/jira/browse/LUCENE-1486
>
>
> > I am trying to figure out a way that I can query a Lucene
> > index for a
> > phrase but have some fuzziness (edit distance and/or
> > wildcard) applied
> > to the individual terms.  An example should help
> > explain what I am
> > trying to do:
> >
> >
> >
> > Index contains:
> >
> > Philadelphia PA
> >
> >
> >
> > Search is done on:
> >
> > Philadelphid PA
> >
> >
> >
> > I want it to result in a hit - basically something like
> > "Philadelphid~0.75 PA" (that syntax is not valid but
> > explains what I am
> > looking for).  Similarly, I would like to be able to
> > do something like
> > "Phil* PA" and get a hit as well.
>
>
>
>
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
