: This is because the index is updated every 5 mins or so, due to the incoming
: feed of stories ..
:
: When you say iteration, i take it you mean, search request, well for each
: search that is conducted I create a new one .. search reader that is ..
yeah ... i meant iteration of your test. don'
Ah, I see, I should of course use the same similarity during indexing
and searching. Many thanks!
On 20/08/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: And then I made this subclass the default similarity. It worked well
: for tf but not for lengthNorm. The reason appears to be that the
: Te
yes there is a new searcher opened each time a search is conducted,
This is because the index is updated every 5 mins or so, due to the incoming
feed of stories ..
When you say iteration, i take it you mean, search request, well for each
search that is conducted I create a new one .. search read
: hits = searcher.search(query, new Sort("sid", true));
you don't show where searcher is initialized, and you don't clarify how
you are timing your multiple iterations -- i'm going to guess that you are
opening a new searcher every iteration right?
sorting on a field requires pre-computing a
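A minimal sketch of the alternative hinted at above: keep one searcher and reopen it only when the index has actually changed, so the per-field sort data is not rebuilt on every request. The class `SearcherHolder` and its two callbacks are hypothetical stand-ins for illustration, not Lucene API; in a real application the version check and the opener would wrap whatever calls your Lucene release provides.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical holder (not part of Lucene) that hands out one shared
 * searcher and reopens it only when the index version changes.
 * S stands in for the searcher type.
 */
final class SearcherHolder<S> {
    private final LongSupplier indexVersion; // assumed: cheap check of the current index version
    private final Supplier<S> opener;        // assumed: expensive call that opens a new searcher
    private S current;
    private long openedVersion;

    SearcherHolder(LongSupplier indexVersion, Supplier<S> opener) {
        this.indexVersion = indexVersion;
        this.opener = opener;
    }

    /** Returns the cached searcher, reopening it only after an index update. */
    synchronized S get() {
        long version = indexVersion.getAsLong();
        if (current == null || version != openedVersion) {
            current = opener.get();
            openedVersion = version;
        }
        return current;
    }
}
```

With an index that is updated every five minutes or so, this would pay the searcher warm-up cost once per update rather than once per search. The sketch leaves out closing the old searcher, which would need to wait for in-flight searches to finish.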
: And then I made this subclass the default similarity. It worked well
: for tf but not for lengthNorm. The reason appears to be that the
: TermScorer class does not call lengthNorm, but instead uses a cache
Actually, the lengthNorm method is used by the IndexWriter; it compresses
the float retur
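A toy illustration of the point that the norm is fixed when the document is written: the classes below are stand-ins invented for this sketch, not Lucene's, and they skip the lossy compression of the stored value. The default formula 1/sqrt(numTerms) matches Lucene's DefaultSimilarity; everything else is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

final class NormDemo {
    /** Stand-in for the length-normalisation hook of a similarity. */
    interface LengthNorm {
        float lengthNorm(int numTerms);
    }

    /** Default behaviour: longer fields get a smaller norm. */
    static final LengthNorm DEFAULT = n -> (float) (1.0 / Math.sqrt(n));

    /** Override that does not penalise long fields. */
    static final LengthNorm FLAT = n -> 1.0f;

    /** Toy index that computes and stores one norm per document at add time. */
    static final class ToyIndex {
        private final Map<Integer, Float> norms = new HashMap<>();
        private final LengthNorm indexTimeNorm;

        ToyIndex(LengthNorm indexTimeNorm) {
            this.indexTimeNorm = indexTimeNorm;
        }

        void addDocument(int docId, int numTerms) {
            norms.put(docId, indexTimeNorm.lengthNorm(numTerms));
        }

        /** Scoring would read this stored value; it never calls lengthNorm again. */
        float storedNorm(int docId) {
            return norms.get(docId);
        }
    }
}
```

In this sketch, an index built with `DEFAULT` keeps its length penalty no matter which similarity is used later for searching; only re-adding the documents through an index built with `FLAT` changes the stored norms.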
what i am measuring is this
Analyzer analyzer = new StandardAnalyzer(new String[]{});
if(fldArray.length > 1)
{
BooleanClause.Occur[] flags = {BooleanClause.Occur.SHOULD,
BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD,
BooleanClause.Occur.SHOULD};
query = MultiFieldQueryP
This is a long time, I think you're right, it's excessive.
What are you timing? The time to complete the search (i.e. get a Hits object
back) or the total time to assemble the response? Why I ask is that the Hits
object is designed to return the first 100 or so docs efficiently. Every 10
Hi there,
I have an index with about 250K documents, to be indexed full text.
there are 2 types of searches carried out, 1. using 1 field, the other using
4 .. for a query string ...
given the nature of the queries required, all stop words are maintained in
the index, thereby allowing for phrasa
I had a situation where I was only interested in whether the term was
there or not (not how many times), and I didn't want to penalize long
fields. So I wrote a Similarity subclass where I overrode the
following methods like this:
public float lengthNorm(String fieldName, int numTerms) {
ret
I think you can still achieve your desired outcome, but I'm not sure I fully
understand the use case. Can you describe more clearly a specific example
of what you need to achieve?
You are correct that "joins" in lucene aren't really a strong point, but
this is often a by-product of thinking abou
10 matches