Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-19 Thread Michael McCandless
Nutch/Solr's CommonGrams is the right way to solve this. It combines frequent terms (e.g. stopwords) with adjacent terms. So "the wizard of oz" will be indexed e.g. as the_wizard wizard_of of_oz. It'll require a full re-index though, and you have to fix up searching so that the same term expansion wo
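
To make the term combination above concrete, here is a minimal plain-Java sketch of the idea. It is not the Nutch/Solr CommonGrams implementation: the class name, the method name, and the two-word list of frequent terms are hypothetical, and the real filters may differ in details such as whether single terms are also emitted. The sketch joins every adjacent pair in which at least one word is frequent, and keeps a word on its own only when it is not part of any such pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a simplified version of the common-grams idea,
// not the actual Nutch/Solr filter.
class CommonGramsSketch {
    // Hypothetical list of frequent terms; a real setup would pick these
    // from the most frequent terms in the index.
    private static final Set<String> COMMON = new HashSet<>(Arrays.asList("the", "of"));

    // True when at least one of the two adjacent words is a frequent term.
    private static boolean formsPair(String a, String b) {
        return COMMON.contains(a) || COMMON.contains(b);
    }

    static List<String> commonGrams(String text) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i < words.length; i++) {
            boolean pairsWithNext = i + 1 < words.length && formsPair(words[i], words[i + 1]);
            boolean pairsWithPrev = i > 0 && formsPair(words[i - 1], words[i]);
            if (!pairsWithNext && !pairsWithPrev) {
                out.add(words[i]); // word is not covered by any pair, keep it alone
            }
            if (pairsWithNext) {
                out.add(words[i] + "_" + words[i + 1]); // combined term
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [the_wizard, wizard_of, of_oz]
        System.out.println(commonGrams("the wizard of oz"));
    }
}
```

The same expansion has to be applied to the query text at search time, otherwise the phrase terms will not match what was indexed.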

PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-19 Thread Daniel Shane
I'm running a medium-size web search with an index size just shy of 9GB with 80 docs in it. We are using Lucene version 2.9.0 (we have not checked yet to see if this applies to older versions as well). By looking at my logs, I'm finding that phrase queries take especially long to perform. In

Re: Lucene 3.0 Search Performance Stats

2010-03-19 Thread Monique Monteiro
Hi Jamie, could you please tell us how much memory your application consumes with Lucene? I'm asking because we are having memory consumption problems with a 32GB index and 1.5GB of RAM allocated to our web application. At the moment, we use textual search. Thanks in advance, Monique On

[JOB] Java Developer with Search Experience Needed in NYC

2010-03-19 Thread Avi Flax
Sorry for the spam, I'm hoping this is relevant enough that you all won't mind too much. Please see our post here: http://jobs.37signals.com/jobs/6312 It doesn't mention search, but we have a big search project coming up — we're going to start evaluating platforms and technologies soon. If you'r

Re: Prefix And Fuzzy

2010-03-19 Thread Anshum
Hi Veera, I'd say you should get yourself a copy of Lucene In Action 2 Ed. It would really help you figure out the hows and whys in case you are looking at using Lucene further. Talking about your solution, let me explain it to you a bit. When you analyse a particular field in Lucene, it gets toke

Prefix And Fuzzy

2010-03-19 Thread vhanuman kumarbaburavi
Dear Paul, I am still fighting with Prefix and Fuzzy search. Based on an archive post I have noticed one thing, i.e. that prefix search works only for the "Field.Index.NOT_ANALYZED" type. I have changed that, and prefix search now works perfectly. But fuzzy search has stopped working, I hope becaus

Re: Version.onOrAfter() complaining it's deprecated but it isn't

2010-03-19 Thread Paul Taylor
kumarbabu ravi wrote: I would like to say thanks first. But I have a small problem with both prefix and Fuzzy search. Is it possible to perform both Prefix and Fuzzy search at the same time? I mean using FuzzyPrefixLength and setFuzzyMinSim. Please advise ASAP. kumarvarbi, thanks for hijacki

Re: Version.onOrAfter() complaining it's deprecated but it isn't

2010-03-19 Thread kumarbabu ravi
I would like to say thanks first. But I have a small problem with both prefix and Fuzzy search. Is it possible to perform both Prefix and Fuzzy search at the same time? I mean using FuzzyPrefixLength and setFuzzyMinSim. Please advise ASAP. On Fri, Mar 19, 2010 at 5:17 PM, Ian Lea wrote: > 1.

Re: Lucene 3.0 Search Performance Stats

2010-03-19 Thread Michael McCandless
Very nice! Thanks for sharing :) Mike On Fri, Mar 19, 2010 at 6:53 AM, Jamie wrote: > I forgot to point out, this is a search using the Lucene realtime search > feature. We get the reader from indexwriter.getReader() for each search. > > On 2010/03/19 01:49 PM, Jamie wrote: >> >> Hi Guys >> >>

Re: Lucene 3.0 Search Performance Stats

2010-03-19 Thread Jamie
I forgot to point out, this is a search using the Lucene realtime search feature. We get the reader from indexwriter.getReader() for each search. On 2010/03/19 01:49 PM, Jamie wrote: Hi Guys I just wanted to congratulate the Lucene guys for a fine job on 3.0!! Since we switched our indexes to

Lucene 3.0 Search Performance Stats

2010-03-19 Thread Jamie
Hi Guys I just wanted to congratulate the Lucene guys for a fine job on 3.0!! Since we switched our indexes to using integer-based range queries based on Date (YYMMHHSS), search speed is lightning fast and memory consumption has dropped considerably! Some stats: Indexed Docs: 7.2M emails I

Re: Version.onOrAfter() complaining it's deprecated but it isn't

2010-03-19 Thread Ian Lea
1. Please ask new questions in new threads. 2. Read the "Why am I getting no hits / incorrect hits?" section of the FAQ. If that doesn't help post again (in a new thread) showing us how you are indexing the search field and creating the query and what the toString() method shows. And use Luke

RE: Version.onOrAfter() complaining it's deprecated but it isn't

2010-03-19 Thread vhanuman kumarbaburavi
Hi Paul, I need some help with Lucene prefix search. I have implemented prefix search in my application but it returns different results, as follows. E.g. take these three products: Abc def zse, def sde sed, fed fer def. I entered characters such as "def" and performed a prefix search, I

Version.onOrAfter() complaining it's deprecated but it isn't

2010-03-19 Thread Paul Taylor
Hi, since downloading Lucene 3.1 my code complains that Version.onOrAfter() is deprecated, but I also have svn access to the source and it isn't deprecated, and doesn't look like it ever has been. Anyone else get this? Paul ---

Re: Corrupt index? Can I recover it?

2010-03-19 Thread Michael McCandless
It sounds like you should first run CheckIndex to see if there's corruption in both indexes... and then run again with -fix to repair the corruption. That repair simply removes any segments with corruption. So after that you'll have to re-index whatever docs are missing... Mike On Thu, Mar 18,