Re: Indexing questions

2008-07-14 Thread Anshum
Hi, As far as I know, you may do any of the processes below while searching (in parallel); the changes just would not be reflected until you reopen the index readers (either by using the reopen command or by closing and opening them explicitly). But the downside to this would be, in case your daemon

Re: matching sub phrases in user entered query...

2008-07-14 Thread Preetam Rao
Hi Steve, It would be simpler if I have a query called SubPhraseQuery in which case I do not have to either generate extra terms during ingestion or generate extra queries during querying. As a user, the best I would hope for is, to ingest the data from some feed into different fields, run the use

Stable score scaling; LSI again

2008-07-14 Thread Asad Sayeed
Hi, I have a couple of questions about how to alter the similarity scores. I need scores that can be thresholded, and whose thresholds remain stable even when I add documents to the IndexWriter, i.e., identity should be a fixed value such as 1.0. I know that for efficiency reasons, Lucene doesn't do

MultiSearcher and TopFieldDocCollector

2008-07-14 Thread Declan Newman
Hi, I'm in the process of trying to optimize searches and avoid the dreaded OutOfMemoryErrors. We currently return the entire document from each of the search results and then filter the results using parameters obtained from a database. Not very efficient. The idea was to override TopFie

Re: Sorting case-insensitively

2008-07-14 Thread Paul J. Lucas
On Jul 10, 2008, at 2:24 PM, Chris Hostetter wrote: if you could submit a test case that reproduces this using a trivial subclass (just return the original String as the Comparable) that can help us verify the bug and the fix. See my e-mail dated July 3, 2008. Assuming I'm right, I don'tr

RE: matching sub phrases in user entered query...

2008-07-14 Thread Steven A Rowe
Hi Preetam, On 07/14/2008 at 1:40 PM, Preetam Rao wrote: > Is there a query in Lucene which matches sub phrases ? > [snip] > > I was redirected to Shingle filter which is a token filter > that spits out n-grams. But it does not seem to be best solution > since one does not know in advance what n

matching sub phrases in user entered query...

2008-07-14 Thread Preetam Rao
Hi, Sorry if you get this mail a second or third time. Getting mail delivery errors from Gmail for some unknown reason. This is my last attempt at sending the mail for the day. :-) Is there a query in Lucene which matches sub phrases? For example, if the document text is "new york existing homes

matching sub phrases in user entered query...

2008-07-14 Thread Preetam Rao
Hi, Sorry if you get this mail a second time. Having some trouble with my mailbox. Is there a query in Lucene which matches sub phrases? For example, if the document text is "new york existing homes *3 bed 2 bath* homes 3 miles from city center 2 rooms" and if the user enters "Brooklyn homes with *3 bed

matching sub phrases from user entered query...

2008-07-14 Thread Preetam Rao
Hi, Is there a query in Lucene which matches sub phrases? For example, if the document text is "new york existing homes *3 bed 2 bath* homes 3 miles from city center 2 rooms" and if the user enters "Brooklyn homes with *3 bed* rooms and swimming pools", I would like to recognize the fact that the doc

Re: Boolean expression for no terms OR matching a wildcard

2008-07-14 Thread Ron Rudy
Can I assume that since nobody replied to this that there's no way to perform this kind of search? What I think I need is two different types of conditions: 1) a wildcard conditional that is forced to match against all indexed values for a field 2) a conditional that matches when NO values at all

Cosine Similarity between two documents, using different zone weights

2008-07-14 Thread Asterios Katsifodimos
Hello *, I have been trying to find an *efficient* (in terms of performance) way to get the Cosine Similarity between two Lucene Documents. I have seen that this can be done with: 1. Converting the document into a query and submitting the query, getting the results and their score. --TOO

Re: Indexing questions

2008-07-14 Thread Michael McCandless
The answer to all 3 is yes, but, you'll have to re-open your IndexReader to see any of those changes. An IndexReader always searches the "point in time" snapshot of the index as of the moment it was opened. Any & all changes done with an IndexWriter (including opening a new index in the

ArrayList or HashMap

2008-07-14 Thread blazingwolf7
Hi, I am working on extracting information from around 2 to 3 million documents and placing it in memory so it can be retrieved for filtering search results. The application will have to extract the information and store it for every search. I am wondering what will be the best way to store this info