Lucene spending a lot of time in BooleanScorer2

2011-05-02 Thread Paul Taylor
Hi, I'm nearing completion on a new version of a Lucene search component for the http://www.musicbrainz.org music database and am having a problem with performance. There are a number of indexes, each built from data in a database: one index for albums, another for artists, and another for tr

AW: "fuzzy prefix" search

2011-05-02 Thread Clemens Wyss
Is it the combination of FuzzyQuery and Term which makes the search go for "word boundaries"? > -Original Message- > From: Clemens Wyss [mailto:clemens...@mysign.ch] > Sent: Monday, May 2, 2011 14:13 > To: java-user@lucene.apache.org > Subject: AW: "fuzzy prefix" search > >

RE: Link to nightly build test reports on main Lucene site needs updating

2011-05-02 Thread Burton-West, Tom
Thanks for fixing++ Tom -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Sunday, May 01, 2011 6:05 AM To: d...@lucene.apache.org; simon.willna...@gmail.com; java-user@lucene.apache.org Subject: RE: Link to nightly build test reports on main Lucene site needs updat

Re: MultiPhraseQuery slowing down over time in Lucene 3.1

2011-05-02 Thread Otis Gospodnetic
Hi, I think this describes what's going on:
10 load N stored queries
20 parse N stored queries, keep them in some List forever
30 for each incoming document create a new MemoryIndex instance "mi"
40 for query 1 to N do mi.search(query)
Over time this step 40 takes longer and longer and longer --

Re: MultiPhraseQuery slowing down over time in Lucene 3.1

2011-05-02 Thread Michael McCandless
By "slowing down over time" do you mean you use the same index (no new docs added) yet running the same MPQ over and over you see it taking longer to execute over time? Mike http://blog.mikemccandless.com On Mon, May 2, 2011 at 12:00 PM, Tomislav Poljak wrote: > Hi, > after running tests on bot

RE: MultiPhraseQuery slowing down over time in Lucene 3.1

2011-05-02 Thread Uwe Schindler
Can you check out the latest 3.1 branch @ https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1 and test if it solves your issue? There was a problem in PhraseQuery's internal sorting and quicksort. It does not slow down over time, but with the type of query (how many terms the phrase contains

MultiPhraseQuery slowing down over time in Lucene 3.1

2011-05-02 Thread Tomislav Poljak
Hi, after running tests on both MemoryIndex and RAMDirectory based indexes in Lucene 3.1, it seems MultiPhraseQueries are slowing down over time (each iteration of executing the same MultiPhraseQueries on the same doc seems to require more and more execution time). Are there any existing/known issues r

Re: questions about the index

2011-05-02 Thread Michael McCandless
On Mon, May 2, 2011 at 9:17 AM, Bernd Fehling wrote: > Dear list, > > some questions about the index. > (questions go to the lucene list because it is more about the index itself) > > First my results from CheckIndex: > > Segments file=segments_l6 numSegments=1 version=FORMAT_3_1 [Lucene 3.1] > Ch

questions about the index

2011-05-02 Thread Bernd Fehling
Dear list, some questions about the index. (questions go to the lucene list because it is more about the index itself) First my results from CheckIndex: Segments file=segments_l6 numSegments=1 version=FORMAT_3_1 [Lucene 3.1] Checking only these segments: _79s: 1 of 1: name=_79s docCount=28146

AW: "fuzzy prefix" search

2011-05-02 Thread Clemens Wyss
I tried this too, but unfortunately I only get hits when the search term is at least as long as the word to be looked up. E.g.: ... Directory directory = new RAMDirectory(); IndexWriter indexWriter = new IndexWriter( directory, IndexManager.getIndexingAnalyzer( LOCALE_DE ), IndexW

RE: "fuzzy prefix" search

2011-05-02 Thread Uwe Schindler
Hi, You can pass an integer to FuzzyQuery that defines the number of leading characters treated as a prefix. All terms must then match this prefix exactly, and the rest of each term is matched fuzzily. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi
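
The prefix behaviour described in this reply can be illustrated with a small plain-Java sketch. It is not Lucene's implementation: the class name `FuzzyPrefixSketch`, the helper names, and the `maxEdits` cutoff are hypothetical, and a plain edit-distance limit stands in for FuzzyQuery's minimum-similarity score. In Lucene 3.x itself the prefix length is the third argument of the `FuzzyQuery(Term, float minimumSimilarity, int prefixLength)` constructor.

```java
// Illustrative sketch only: plain Java, NOT Lucene's FuzzyQuery implementation.
// All names here are hypothetical; maxEdits is a simplification of Lucene's
// minimum-similarity threshold.
class FuzzyPrefixSketch {

    // Classic Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty string to b[0..j)
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance from a[0..i) to the empty string
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // A candidate term matches if it shares the first prefixLength characters
    // of the query exactly and the remainders are within maxEdits edits.
    static boolean matches(String query, String candidate, int prefixLength, int maxEdits) {
        if (query.length() < prefixLength || candidate.length() < prefixLength) {
            return false;
        }
        if (!candidate.startsWith(query.substring(0, prefixLength))) {
            return false;
        }
        String queryRest = query.substring(prefixLength);
        String candidateRest = candidate.substring(prefixLength);
        return editDistance(queryRest, candidateRest) <= maxEdits;
    }
}
```

Under these assumptions, `matches("merlot", "merlott", 3, 1)` accepts the term, `matches("merlot", "berlot", 3, 1)` rejects it because the prefix differs, and `matches("mer", "merlot", 3, 1)` also rejects it because the remainder "lot" is three edits away from the empty string. That last case is in line with the observation elsewhere in this thread that a search term much shorter than the indexed word gets no hits: a fixed prefix does not turn a fuzzy match into a prefix match.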

"fuzzy prefix" search

2011-05-02 Thread Clemens Wyss
I'd like to search fuzzily but not on a full term. E.g. I have a text "Merlot del Ticino" and I'd like "mer", "merr", "melo", ... to match. If I use FuzzyQuery, only "merlot" and "merlott" hit. What Query combination should I use? Thx Clemens --

Re: ComplexPhraseQueryParser with multiple fields

2011-05-02 Thread Ahmet Arslan
Hi, I've just started using the ComplexPhraseQueryParser and it works great with one field but is there a way for it to work with multiple fields?  For example, right now the query: job_title: "sales man*" AND NOT contact_name: "Chris Salem" throws this exception Caused by: org.apache.lucene.q