On 19/05/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
i assume when you say this...
: 1. I need to temporarily index sets of documents on the fly, say 100 at a
: time.
you mean that you'll have lots of temporary indexes of a few hundred
documents and then you'll do a bunch of queries and th
Hello
What is the best way to search? Should I separate all the fields, or
create a big one that has all fields? Does this impact the
performance dramatically?
By creating a big field, I would not need to create a BooleanQuery...
Last time I did not get any clues; let's see if this time will be bet
Hi All,
We have recently upgraded from Lucene 1.4.3 to Lucene 1.9.1.
After the upgrade, we are facing some issues:
1. Indexing seems to be behaving differently. There were more than 300
segment files (.cfs) in the index, and the IndexSearcher is taking forever
to refresh the index. Have t
Hi, Erik,
Thanks for your prompt response.
I didn't dig into the Lucene source code deeply enough, but I noticed that the
IndexSearcher uses an IndexReader, and the cost of initializing an
IndexReader is a bit high.
My application is a webapp, so I think it may be good if I cache some
instances of
On Sunday 21 May 2006 20:01, Chris Hostetter wrote:
> : "wrapping" it with a SpanNearQuery. Unless, there is a way to make
> : Span(Near)Query take a BooleanQuery as its clause. Is there a way to
>
> nope .. span queries can only contain other span queries -- they need the
> sub queries to propogat
: If I use a sort on the datefield and perform a query (with that sort)
: will it always rebuild the whole cache or just the cache for the actual
: hits?
the FieldCache is built for all documents so that it's completely
reusable for any search that sorts on that field -- as long as you keep
you
: "wrapping" it with a SpanNearQuery. Unless, there is a way to make
: Span(Near)Query take a BooleanQuery as its clause. Is there a way to
nope .. span queries can only contain other span queries -- they need the
sub queries to propagate up the span information which normal queries
don't know abou
Thanks.
I prefer sorting. But I'm afraid that it won't be enough.
How long do you think it will take to rebuild the caches?
If I use a sort on the datefield and perform a query (with that sort) will it
always rebuild the whole cache or
just the cache for the actual hits?
/
Marcus
___
On May 21, 2006, at 11:31 AM, Marcus Falck wrote:
I will use Lucene to index 200 million documents (doc size 2kb ->
20 kb).
With the following requirements:
IndexSearcher needs to be created at least every 5 minutes.
The ranking/scoring/sorting will need to return the hits ordered by
date desc.
Hi,
I will use Lucene to index 200 million documents (doc size 2kb -> 20 kb).
With the following requirements:
IndexSearcher needs to be created at least every 5 minutes.
The ranking/scoring/sorting will need to return the hits ordered by date desc.
Will the sorting be good enough on a machine wit
Come to think of it... I can only use SpanOrQuery because I'm
"wrapping" it with a SpanNearQuery. Unless, there is a way to make
Span(Near)Query take a BooleanQuery as its clause. Is there a way to
set the min. number of terms to be matched in an OR subquery inside a
SpanNearQuery?
Thanks.
Micha
Hi,
Somehow, after running many searches using instances of SpanQuery
(mostly SpanNearQuery), I get an ArrayIndexOutOfBoundsException:
"bash-2.03$ java.lang.ArrayIndexOutOfBoundsException: 2147483647
at org.apache.lucene.search.spans.SpanScorer.score(SpanScorer.java:72)
at
org.a
Hi Wouter,
My thought would be to go for plan (b) (have not tested it though). This
would produce simply the sum of frequencies of the different terms (I'm
referring to a real multi-term query, not a phrase as you mentioned -
"the man" - which should work).
The problem I see is that you lose
13 matches