Re: What is the best practice of using synonymy ?

2010-03-23 Thread Jeff Zhang
Ahmet, thanks for your suggestion. Could you explain more about this, or point me to a reference article that explains the reasons in detail? Thanks On Tue, Mar 23, 2010 at 6:33 PM, Ahmet Arslan wrote: > > I'd like to use the synonymy in my project. And I think there's two candidates s

RE: Lucene query with long strings

2010-03-23 Thread Steven A Rowe
Hi Aaron, Your "false positives" comments point to a mismatch between what you're currently asking Lucene for (any document matching any one of the terms in the query) and what you want (only fully "correct" matches). You need to identify the terms of the query that MUST match and tell Lucene
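
The distinction drawn above, between "any term may match" and "these terms must match", can be sketched without Lucene at all. The class below is a hypothetical illustration in plain Java, not Lucene code; in Lucene itself the same choice is expressed per clause on a BooleanQuery with BooleanClause.Occur.SHOULD or BooleanClause.Occur.MUST. The tokenization rule is an assumption made for the sketch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: contrasts optional and required term matching.
class RequiredTermMatcher {

    // Lower-case the text and split on anything that is not a letter or digit.
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // "Any term" semantics: one shared term is enough, which is where false positives come from.
    static boolean matchesAny(String doc, List<String> queryTerms) {
        Set<String> docTokens = tokens(doc);
        for (String q : queryTerms) {
            if (docTokens.contains(q.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // "Required" semantics: every listed term has to be present in the document.
    static boolean matchesRequired(String doc, List<String> mustTerms) {
        Set<String> docTokens = tokens(doc);
        for (String q : mustTerms) {
            if (!docTokens.contains(q.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }
}
```

With a query of "stanford university", a document from a different university passes matchesAny on the shared word alone but fails matchesRequired once "stanford" is marked as required.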

Re: Lucene query with long strings

2010-03-23 Thread Ahmet Arslan
> hi all, I have been playing with Lucene for a while now, but stuck on a perplexing issue. I have an index, with a field "Affiliation", some example values are: - "Stanford University School of Medicine, Palo Alto, CA USA", - "Institute of Neurobiology, School of Medicine, Sta

Lucene query with long strings

2010-03-23 Thread Aaron Schon
Hi all, I have been playing with Lucene for a while now, but I'm stuck on a perplexing issue. I have an index with a field "Affiliation"; some example values are: - "Stanford University School of Medicine, Palo Alto, CA USA", - "Institute of Neurobiology, School of Medicine, Stanford University, P

Re: BooleanQuery and SpanQuery : how to get « combined » spans?

2010-03-23 Thread Benoit Mercier
Thank you, Grant. I will try your suggested approach. It confirms that I wasn't too lost ;-) mercibe Grant Ingersoll wrote: On Mar 23, 2010, at 12:58 AM, Benoit Mercier wrote: Hi, I would like to write a query composed of a BooleanQuery (several clauses) and a SpanQuery (Spa

Re: BooleanQuery and SpanQuery : how to get « combined » spans?

2010-03-23 Thread Grant Ingersoll
On Mar 23, 2010, at 12:58 AM, Benoit Mercier wrote: > Hi, I would like to write a query composed of a BooleanQuery (several clauses) and a SpanQuery (SpanNearQuery), where both are mandatory. Sounds simple but I have to work on spans returned by this query. I know that I could u

Re: another question about phrasequery?

2010-03-23 Thread luocanrao
Hi Ian Lea, I mean that a query for "little boy" will match both documents. A document that only contains the terms "boy" and "little" will match the query. Document one gets some extra score because it matches the query exactly (the term positions match completely). Did I describe it clearly? For example, Document 1:

Re: how lucene search works in memory

2010-03-23 Thread Erick Erickson
First, I'd be sure you need to. See the following: http://wiki.apache.org/lucene-java/ImproveSearchingSpeed A lot of very bright people have worked very hard at optimizing Lucene's search *and* the operating system caching. I'd carefully examine m

Re: What is the best practice of using synonymy ?

2010-03-23 Thread Anshum
Index-time expansion is a much better approach. The only downside is the increase in index size. I've used it on a sizeable dataset, and even the indexing time doesn't seem to go up considerably. Searching for multiple terms is generally less efficient when you could do it with one. -- Anshum Gupta Nau
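
The trade-off described above can be made concrete with a toy inverted index. This is a minimal sketch in plain Java, assuming a hand-written synonym table; the class and its methods are hypothetical and are not a Lucene API. Each token is posted under its own term and under every synonym, so the index grows but a search stays a single-term lookup.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index, not Lucene: expands synonyms while indexing.
class SynonymIndex {
    private final Map<String, List<String>> synonyms;                  // term -> its synonyms
    private final Map<String, Set<Integer>> postings = new HashMap<>(); // term -> document ids

    SynonymIndex(Map<String, List<String>> synonyms) {
        this.synonyms = synonyms;
    }

    // Index a document: post every token, plus each of its synonyms, under the same doc id.
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue;
            }
            post(token, docId);
            for (String syn : synonyms.getOrDefault(token, Collections.emptyList())) {
                post(syn, docId);
            }
        }
    }

    private void post(String term, int docId) {
        postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
    }

    // Search is one lookup; no query rewriting is needed at search time.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    // Number of distinct indexed terms, a rough stand-in for index size.
    int termCount() {
        return postings.size();
    }
}
```

The search-time alternative would keep the index small and instead rewrite each query into one lookup per synonym, which is the multi-term cost the reply refers to.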

Re: how lucene search works in memory

2010-03-23 Thread Anshum
Hi Suman, I couldn't find a link, but the approaches to loading an index into memory would be: 1. Create a tmpfs partition and copy your index into the partition, then open the index reader/searcher from the tmpfs. * You would have to handle the copying/management of indexes in this case. * I

Re: using lucene from apache threads

2010-03-23 Thread Ian Lea
Indexes aren't exactly loaded into memory when opened, but your approach certainly is inefficient. A common alternative is to have Apache talk to e.g. Tomcat, and Tomcat will keep an index open. Or send queries to Solr or a daemon or whatever. Your CGI scripts could run a program which talks t

Re: What is the best practice of using synonymy ?

2010-03-23 Thread Ahmet Arslan
> I'd like to use synonymy in my project, and I think there are two candidate solutions: 1. using synonymy at the indexing stage, enhancing the index with synonyms; 2. using synonymy at the search stage, enhancing the search query with synonyms. I'd like to know whic

Re: SpanFirstQuery and PrefixQuery combined?

2010-03-23 Thread Ahmet Arslan
> I'm trying to do a search over an index with names. This is how I currently create a document of the index: document.add(new Field("id", item.getItemId().toString(), Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS)); document.add(new Field("item.name", item.getAutoCompleteTex

how lucene search works in memory

2010-03-23 Thread suman . holani
Hello, I am trying to optimize searching by putting indexes into memory. RAMDirectory is not an option for me, as I am transferring the built indexes to a slave system for use. So could you let me know how to place my indexes into memory (I'm thinking of using mmap)? For this I want to know how exa
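
For readers unfamiliar with the mmap idea mentioned in the question, here is a minimal sketch using only the JDK's FileChannel.map. The class name and the file path argument are hypothetical; this shows the operating-system facility and is not how a Lucene index is opened. Lucene provides its own org.apache.lucene.store.MMapDirectory for that purpose.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: memory-maps a single file read-only.
class MmapSketch {

    // Map the whole file into memory. The OS pages data in lazily on first access,
    // so nothing is copied up front. A single mapping is limited to 2 GB in the JDK,
    // so larger files would need to be mapped in several chunks.
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }
}
```

Because pages are loaded on demand and kept in the operating system's page cache, a mapped file behaves much like an in-memory copy once it has been read, without the index having to be rebuilt on the slave.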

using lucene from apache threads

2010-03-23 Thread suman . holani
Hi, I am using Apache threads to invoke a CGI script, which opens and closes the index searcher for every thread. I'm making searches in all threads. Does that mean that for every thread the indexes would be loaded into memory for searching? Because then it's very inefficient. Is there any method by

SpanFirstQuery and PrefixQuery combined?

2010-03-23 Thread Lukas Österreicher
Hi. I'm trying to do a search over an index with names. This is how I currently create a document of the index: document.add(new Field("id", item.getItemId().toString(), Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS)); document.add(new Field("item.name", item.getAutoCompleteText(), Field.Sto

Re: Keeping the IndexWriter open?

2010-03-23 Thread Michael McCandless
This is perfectly fine. It also makes it possible to use near-real-time readers (IndexWriter.getReader) after you've updated the index, for fast turnaround on searching. Mike On Tue, Mar 23, 2010 at 4:14 AM, Konstantyn Smirnov wrote: > Hi all, are there any potential dangers in keepi

Keeping the IndexWriter open?

2010-03-23 Thread Konstantyn Smirnov
Hi all, are there any potential dangers in keeping the IndexWriter (which is a singleton in my app) open throughout the whole application life? I have tested it briefly, and it seems to be working fine... Am I missing some pitfalls and caveats? Thanks - Konstantyn Smirnov, CTO http://www

Re: Optimising the lucene search

2010-03-23 Thread Anshum
Hi, I couldn't really get the point here. Do you think you would never have to search the fields separately? Concatenating the fields would mean a lot of information loss and you'd not be able to search the fields for a query like (Field1:X AND Field2:Y ) . If that's the case you could combine the

Optimising the lucene search

2010-03-23 Thread suman . holani
Hello, Optimising the Lucene search: Use a combined search field for all text fields instead of (or on top of) indexing them separately and searching with a complex query like field1:query OR field2:query ... OR fieldN:query. Reducing the number of fields makes indexing and search much faster. Use combin
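
The combined-field idea, and the information loss raised in the reply to this message, can be sketched in a few lines of plain Java. Everything below is hypothetical illustration code rather than a Lucene API: a document is modelled as a map from field name to text, and the combined field is simply the concatenation of all values.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not Lucene: compares a catch-all field with per-field matching.
class CombinedFieldSketch {

    // Build the catch-all field by joining every field value with spaces.
    static String combine(Map<String, String> fields) {
        return String.join(" ", fields.values());
    }

    // Case-insensitive whole-word containment check.
    static boolean hasTerm(String text, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (t.equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    // Combined-field query: both terms anywhere in the document.
    static boolean matchesCombined(Map<String, String> fields, String x, String y) {
        String all = combine(fields);
        return hasTerm(all, x) && hasTerm(all, y);
    }

    // Fielded query in the spirit of (Field1:X AND Field2:Y).
    static boolean matchesFielded(Map<String, String> fields,
                                  String field1, String x, String field2, String y) {
        return hasTerm(fields.getOrDefault(field1, ""), x)
            && hasTerm(fields.getOrDefault(field2, ""), y);
    }
}
```

A document whose field1 holds Y and whose field2 holds X satisfies the combined query but not the fielded one, which is why a combined field is often kept alongside the separate fields rather than replacing them.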