from:"Andy Goodell"

Re: Scaling out/up or a mix

2009-06-30 Thread Andy Goodell

I have improved date-sorted searching performance pretty dramatically by replacing the two step "search then sort" operation with a one step "use the date as the score" algorithm. The main gotcha was making sure to not affect which results get counted as hits in boolean searches, but overall I onl

phrases and slop

2008-08-28 Thread Andy Goodell

I thought I understood phrases and slop until one of my coworkers brought by the following example. For a document that contains "quick brown fox", the queries "quick brown fox"~0, "quick fox brown"~2, and "fox quick brown"~3 all match. I would have expected "fox quick brown" to require a 4 instead of a 3, two to
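
The slop values in this example can be reproduced with a small model. The sketch below assumes (this is not stated in the thread) that each query term's "shift" is its position in the document minus its position in the query, and that the slop needed for a match is the largest shift minus the smallest shift. The class and method names are hypothetical, and the model ignores repeated terms.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only: not Lucene code, and not taken from the thread.
class SlopSketch {

    /**
     * Returns the smallest slop at which queryTerms would match docTerms
     * under the "spread of position shifts" assumption, or -1 if a query
     * term is missing from the document. Assumes no repeated terms.
     */
    static int requiredSlop(List<String> docTerms, List<String> queryTerms) {
        if (queryTerms.isEmpty()) {
            return 0;
        }
        int minShift = Integer.MAX_VALUE;
        int maxShift = Integer.MIN_VALUE;
        for (int q = 0; q < queryTerms.size(); q++) {
            int d = docTerms.indexOf(queryTerms.get(q));
            if (d < 0) {
                return -1; // term not in the document, so no slop can help
            }
            int shift = d - q; // how far this term sits from its query slot
            minShift = Math.min(minShift, shift);
            maxShift = Math.max(maxShift, shift);
        }
        return maxShift - minShift;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("quick", "brown", "fox");
        System.out.println(requiredSlop(doc, Arrays.asList("quick", "brown", "fox"))); // 0
        System.out.println(requiredSlop(doc, Arrays.asList("quick", "fox", "brown"))); // 2
        System.out.println(requiredSlop(doc, Arrays.asList("fox", "quick", "brown"))); // 3
    }
}
```

Under this model "fox quick brown" needs 3 rather than 4 because "quick" and "brown" share the same shift (-1), so only the gap between that shift and the shift of "fox" (+2) counts.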

Re: Indexing Wikipedia dumps

2007-12-12 Thread Andy Goodell

My firm uses a parser based on javax.xml.stream.XMLStreamReader to break (English and non-English) Wikipedia XML dumps into Lucene-style "documents and fields." We use Wikipedia to test our language-specific code, so we've probably indexed 20 Wikipedia dumps. - andy g On Dec 11, 2007 9:35 PM, Oti
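
A minimal sketch of what such a streaming parser might look like, using only the JDK's javax.xml.stream API. It is not the parser described in the message: the class name, the choice of "title" and "text" as fields, and the map-per-page representation are assumptions made for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: turns each <page> element into a field-name -> value map.
class WikiDumpSketch {

    static List<Map<String, String>> parsePages(Reader in) {
        List<Map<String, String>> pages = new ArrayList<>();
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
            Map<String, String> current = null;      // fields of the page being read
            StringBuilder text = new StringBuilder(); // character data of the current element
            while (xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if ("page".equals(xml.getLocalName())) {
                        current = new LinkedHashMap<>();
                    }
                    text.setLength(0);
                } else if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    text.append(xml.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = xml.getLocalName();
                    if (current != null && ("title".equals(name) || "text".equals(name))) {
                        current.put(name, text.toString());
                    } else if (current != null && "page".equals(name)) {
                        pages.add(current); // one "document" per page
                        current = null;
                    }
                    text.setLength(0);
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed dump XML", e);
        }
        return pages;
    }

    public static void main(String[] args) {
        String sample = "<mediawiki><page><title>Fox</title>"
                + "<revision><text>Quick brown fox</text></revision></page></mediawiki>";
        System.out.println(parsePages(new StringReader(sample)));
    }
}
```

Because the reader is pull-based, only the current page is held in memory, which matters for multi-gigabyte dumps. Each map could then be copied into whatever document-and-field structure the indexer expects.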

Searching with a score cutoff

2007-06-04 Thread Andy Goodell

Currently our application implements a score cutoff by iterating through the hits and then stopping once it reaches a hit whose score is below our threshold. We'd like to optimize this (and avoid looking at the entire set of hits when we don't need to) by having the score cutoff applied when the hits ar
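
One way to picture applying the cutoff at collection time is a per-hit callback that drops low-scoring documents as they arrive. The sketch below is self-contained and does not use Lucene: the HitSink interface and CutoffSink class are hypothetical stand-ins for whatever per-hit collection hook the search library provides, and the scores are assumed to be directly comparable to the threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a stand-in for a per-hit collection callback.
class ScoreCutoffSketch {

    /** Hypothetical callback invoked once for every matching document. */
    interface HitSink {
        void collect(int doc, float score);
    }

    /** Keeps only documents whose score is at or above the threshold. */
    static final class CutoffSink implements HitSink {
        private final float minScore;
        final List<Integer> docs = new ArrayList<>();

        CutoffSink(float minScore) {
            this.minScore = minScore;
        }

        @Override
        public void collect(int doc, float score) {
            if (score >= minScore) {
                docs.add(doc); // below-threshold hits are never stored
            }
        }
    }

    public static void main(String[] args) {
        CutoffSink sink = new CutoffSink(0.5f);
        sink.collect(1, 0.9f);
        sink.collect(2, 0.2f);
        sink.collect(3, 0.5f);
        System.out.println(sink.docs); // [1, 3]
    }
}
```

A filter like this still sees every matching document once, but it avoids building and sorting a result list that includes hits the application will discard anyway.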

Re: How many Searches is a Searcher Worth?

2007-04-05 Thread Andy Goodell

My approach to dealing with these kinds of issues (which has worked well for me thus far) is:
- Run java with the -XX:+HeapDumpOnOutOfMemoryError command-line option
- Use jhat to inspect the heap dump, like so:
  $ /usr/java/jdk1.6/bin/jhat ./java_pid1347.hprof
jhat will take a while to parse the hea

Re: performance differences between 1.4.3 and 1.9.1

2006-04-26 Thread Andy Goodell

For my application we have several hundred indexes, different subsets of which are searched depending on the situation. Aside from not upgrading to Lucene 1.9, or making a big index for every possible subset, do you have any ideas for how we can maintain fast performance? - andy g On 4/26/06, Da

Query to return all documents in the index

2005-10-05 Thread Andy Goodell

Hi, In my project we've been using the Searcher.search(query, filter, sort) method to gather results. But as it turns out, sometimes we just want all of the documents that match the filter, sorted by the sort field. Does anyone know a query that returns all the documents in the index, so that

7 matches
