Been a while since I've been in the benchmark stuff, so I am going to
take some time to look at this when I get a chance, but off the cuff I
think you are opening and closing the reader for each search. Try using the
OpenReader task before the 100 searches and then the CloseReader task.
That will
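For reference, the suggested reuse pattern in contrib/benchmark algorithm syntax looks roughly like this (a sketch; the round name and repeat count are made up, the OpenReader/CloseReader/Search task names are the real ones):

```
# open one reader, run all searches against it, then close it
OpenReader
{ "SearchSameRdr" Search > : 100
CloseReader
```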
One thing that others have tried is to keep a RAMindex that you
use for your modifications. That is, an index that *only* has your
mods, not your original index. But, and here's the key, when you
update, you update BOTH your RAM and FS based indexes.
When searching, you search BOTH indexes, giving
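Independent of the Lucene API, the update-both/search-both shape described above looks roughly like this (a toy sketch with plain maps standing in for the RAM and FS indexes; class and method names are made up):

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the RAM-plus-FS pattern: the "ram" index holds only
// recent modifications, the "fs" index holds everything. Updates go to
// BOTH; searches consult both and let the RAM copy win for fresh docs.
class DualIndex {
    private final Map<String, String> fs = new HashMap<>();   // large, disk-based index
    private final Map<String, String> ram = new HashMap<>();  // small, mods-only index

    void update(String id, String doc) {
        fs.put(id, doc);   // durable copy, visible after the next reader reopen
        ram.put(id, doc);  // immediately visible copy
    }

    String search(String id) {
        // Prefer the RAM index so recent mods are seen without reopening.
        String hit = ram.get(id);
        return hit != null ? hit : fs.get(id);
    }
}
```

In real Lucene terms this would be an index on a RAMDirectory alongside one on an FSDirectory, searched together, with the RAM index discarded whenever the FS reader is reopened.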
On 03/11/2008, at 11:07 PM, Mark Miller wrote:
Am I missing your benchmark algorithm somewhere? We need it.
Something doesn't make sense.
I thought I had included it at [1] before but apparently not, my
apologies for that. I have updated that wiki page. I'll also reproduce
it here:
{ "Ro
I had a question about best practices and reading from an
IndexWriter.
Currently, we have an index which we call the master index. This index, in
itself, represents our data model. Many clients can access this index.
However, we have importer and updating clients which essentially add
Pablo,
Would you mind adding a little more detail about how you're working
around the problem?
I'm still evaluating our different options so am interested in what you did.
Todd
On Mon, Nov 3, 2008 at 2:37 PM, PabloS <[EMAIL PROTECTED]> wrote:
>
> Thanks hossman, but I've already 'solved' the pr
Thanks hossman, but I've already 'solved' the problem without the need to
patch lucene. I had to code a bit around Lucene's visibility restrictions
but I've managed to completely skip the field caching mechanism and add
ehcache to it.
At the moment it seems to be working quite well, although not
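For anyone curious what such a workaround typically looks like: the usual shape is an external cache keyed per (reader, field), whose entries die with the reader instead of living as long as the JVM. A stdlib-only sketch of the keying (the ehcache specifics are omitted; all names here are made up):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Supplier;

// Sketch of an external field cache: values are keyed per (reader, field),
// and because the outer map is weak-keyed, a reader's entries become
// collectable as soon as the reader itself is no longer referenced.
class ExternalFieldCache {
    private final Map<Object, Map<String, String[]>> byReader = new WeakHashMap<>();

    String[] get(Object reader, String field, Supplier<String[]> loader) {
        Map<String, String[]> fields =
            byReader.computeIfAbsent(reader, r -> new HashMap<>());
        // Load the field values only on the first request for this reader.
        return fields.computeIfAbsent(field, f -> loader.get());
    }
}
```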
part of the check, this exception is thrown:
Error: could not read any segments file in directory
java.io.FileNotFoundException: no segments* file found in
[EMAIL PROTECTED]/rt10/jetty/20081103
at org.apache.lucene.index.SegmentInfos$findSegmentsFile.run(SegmentInfos.java:587)
Hi,
I'd like to find documents that are similar to the one I have
in the index (or the one I am about to add, if there is no
similar document... I prefer this way if possible).
If I understand it correctly, I should be able to use
TermFreqVector for this. I wanted to tell Lucene,
"search for simil
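The underlying idea, for which TermFreqVector gives you the raw material, is to treat each document as a bag of term frequencies and compare the bags; contrib's MoreLikeThis builds a query this way. A Lucene-free sketch of the comparison step (names are made up):

```java
import java.util.Map;

// Cosine similarity between two term-frequency maps: the core of a
// "find documents similar to this one" comparison built from term vectors.
class TermVectorSimilarity {
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            na += (double) e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) dot += (double) e.getValue() * other;
        }
        for (int v : b.values()) nb += (double) v * v;
        return (na == 0 || nb == 0) ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
    }
}
```

Documents scoring near 1.0 are near-duplicates; a threshold on this score decides whether to add the new document or reuse the existing one.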
I have a Lucene search and I want to implement a way to sort the results by
giving one search term more importance than another, sorting by the
scores I'm getting. What would be the best way to do something like this?
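In Lucene itself this is usually done with query-time boosts (e.g. `important^4 other` in query syntax), and the default relevance sort then reflects the weights. The arithmetic behind it, shown Lucene-free (a toy sketch; all names made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Manual version of what a boosted query does: weight each term's score
// contribution, then sort hits by the combined score, descending.
class WeightedSort {
    // perTermScores: docId -> {scoreForTerm1, scoreForTerm2}
    static List<String> rank(Map<String, double[]> perTermScores, double w1, double w2) {
        List<String> ids = new ArrayList<>(perTermScores.keySet());
        ids.sort((x, y) -> {
            double sx = w1 * perTermScores.get(x)[0] + w2 * perTermScores.get(x)[1];
            double sy = w1 * perTermScores.get(y)[0] + w2 * perTermScores.get(y)[1];
            return Double.compare(sy, sx); // descending by combined score
        });
        return ids;
    }
}
```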
--
View this message in context:
http://www.nabble.com/Sort-search-by-weight
: I'm having a similar problem with my application, although we are using
: lucene 2.3.2. The problem we have is that we are required to sort on most of
: the fields (20 at least). Is there any way of changing the cache being used?
there is a patch in Jira that takes a completely different approa
Thank you both for your help.
> Date: Fri, 31 Oct 2008 09:06:50 +0100
> From: [EMAIL PROTECTED]
> To: java-user@lucene.apache.org
> Subject: Re: Read all the data from an index
>
> Erick Erickson wrote:
> > I'm not sure what *could* be easier than looping with IndexSearcher.doc(),
> > looping fro
On Mon, 2008-11-03 at 04:42 +0100, Justus Pendleton wrote:
> 1. Why does the merge factor of 4 appear to be faster than the merge
> factor of 2?
Because you alternate between updating the index and searching? With 4
segments, chances are that most of the segment-data will be unchanged
between sear
Am I missing your benchmark algorithm somewhere? We need it. Something
doesn't make sense.
- Mark
Justus Pendleton wrote:
Howdy,
I have a couple of questions regarding some Lucene benchmarking and
what the results mean [3]. (Skip to the numbered list at the end if you
don't want to read the
Hello Justus, Chris and Otis,
IIRC Ocean [1] by Jason Rutherglen addresses the issue for real time
searches on large data sets. A conceptually comparable implementation is
done for Jackrabbit, where you can see an enlightening picture over here
[2]. In short:
1) IndexReaders are opened only once