Hi Marc
I have basically gone through the book Lucene in Action, where it
suggests requerying would be better, but I believe it depends on the kind
of application you have. In my case I need to rank the hits according to
some other parameters, so I need the total hits at a time, then rank them
accord
Hi all,
Suppose the user enters the following query using a textbox interface:
"rate based optimization" (as a phrase query, including the quotes). The
query is parsed using QueryParser, then it is rewritten, and given to
the highlighter. Then, method getBestTextFragments is called.
The met
I'm looking at SpanQueries as I work on new test cases for LUCENE-557, and
I'm confused by the implementation of SpanFirstQuery.getSpans().
In the anonymous Spans instance returned, start() and end() are always
the start() and end() of the inner SpanQuery for the current doc --
shouldn't the sta
Apparently Sun's Niagara servers have a weak FPU, and I don't need
my matches to contain floating point scores, so I would like to
avoid floating point calculations when scoring, if possible.
Doing a quick `grep -R ' float ' *` in the source tree shows a
number of places where floats are used:
Hello,
Apparently Sun's Niagara servers have a weak FPU, and I don't need my matches
to contain floating point scores, so I would like to avoid floating point
calculations when scoring, if possible.
Doing a quick `grep -R ' float ' *` in the source tree shows a number of places
where floats ar
Hi!
[I'm sorry for also posting this on the dev mailing list, but I was not sure
in which one it would be best, so if there is a moderator, please kill
either one.]
I'm planning on contributing to Lucene by adding a new kind of query. I don't
know what to call it yet, but it would be a mix of Bool
hu andy wrote:
Hi, I have an application that needs to mark the retrieved documents which have
been read, so that next time I needn't read the marked documents again.
You could mark the documents as deleted, then later clear deletions. So
long as you don't close the IndexReader, the deletions wil
On Fri, Apr 28, 2006 at 01:54:51PM +0800, jason wrote:
> After reading the code, I found the similarity measure in Lucene is not the
> same as the cosine coefficient measure commonly used. I don't know whether it is
> correct. And I wonder whether I can use the cosine coefficient measure in
> Lucene or mayb
Hi Kinnar,
Well, I have quite a few indexes, some of which get
updated infrequently with large loads (quarterly) and
then some indexes which will have approx 2000
additions a day.
Originally I planned to store the results on the
session - but I have to design for growth, both in
users and in data
Hi Marc
Can you give some statistics about the amount of data you are indexing?
Don't you think requerying for pagination will increase the time taken
to fetch the hits, compared with loading all the hits into memory once
and then displaying them as and when the user clicks on the next
but
I'm caching hits by query. When more documents are accessed, Lucene
automatically re-queries the index to retrieve them.
When the index changes, I reopen the IndexReader and clear the cache.
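
A minimal sketch of the caching scheme described above, written in plain Java without any Lucene calls. The class name QueryResultCache, the version number used to detect index changes, and the search callback are all assumptions made for illustration; they are not part of the original message or of the Lucene API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical helper: caches result lists per query string and
 * clears everything as soon as the index version changes.
 */
class QueryResultCache {
    private final Map<String, List<Integer>> cache = new HashMap<>();
    private long indexVersion;

    QueryResultCache(long indexVersion) {
        this.indexVersion = indexVersion;
    }

    /**
     * Returns the cached document ids for the query. The search callback
     * (an assumption standing in for the real search) runs only on a miss.
     */
    List<Integer> get(String query, long currentVersion,
                      Function<String, List<Integer>> search) {
        if (currentVersion != indexVersion) {
            // The index changed: drop all cached results.
            cache.clear();
            indexVersion = currentVersion;
        }
        return cache.computeIfAbsent(query, search);
    }
}
```

The caller would pass whatever change indicator the application already tracks as currentVersion; how that value is obtained is left open here.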
Marc Dauncey wrote:
I read somewhere recently (maybe even on this list) a
recommendation to requery each time f
Thank you all for the ideas and thanks to the developers for producing such a
great tool. I hadn't considered the "too many clauses" problem in my original
implementation and I'm definitely hitting it.
I decided to use a bi-gram tokenization approach combined with a PhraseQuery to
get the "term
p.s. To avoid that issue you could store the result set's document ids in
the session.
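
One way this suggestion might look, as a sketch only: the document ids from the first query are copied once and pages are sliced from that copy, so going back to a previous page shows the same hits. SessionResultPager and its methods are hypothetical names, and the session storage itself is not shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder for one result set's document ids, kept per session. */
class SessionResultPager {
    private final List<Integer> docIds;
    private final int pageSize;

    SessionResultPager(List<Integer> docIds, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        // Defensive copy, so later changes to the caller's list do not alter the pages.
        this.docIds = Collections.unmodifiableList(new ArrayList<>(docIds));
        this.pageSize = pageSize;
    }

    /** Returns the ids on the given zero-based page, or an empty list if out of range. */
    List<Integer> page(int pageNumber) {
        int from = pageNumber * pageSize;
        if (pageNumber < 0 || from >= docIds.size()) {
            return Collections.emptyList();
        }
        int to = Math.min(from + pageSize, docIds.size());
        return docIds.subList(from, to);
    }
}
```

The trade-off is memory per session: the full id list is held for as long as the user keeps paging.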
Marc Dauncey wrote:
Yes, I was thinking about index updates.
Getting a different result set when you go back to a
previous page might be an issue - could always cache
each page as it's opened rather than
On Apr 28, 2006, at 5:35 AM, Eric Jain wrote:
What is the best way to prevent a phrase query such as "eggs white"
matching "fried eggs\nwhite snow"?
Two possibilities I have thought about:
1. Replace all line breaks with a special string, e.g. "newline".
2. Have an analyzer somehow increment
Yes, I was thinking about index updates.
Getting a different result set when you go back to a
previous page might be an issue - could always cache
each page as it's opened rather than the entire result
set.
--- Hannes Carl Meyer <[EMAIL PROTECTED]> wrote:
> Hi Marc,
>
> I'm using this met
Hi Marc,
I'm using this method for a web-application. I'm storing only the
current viewable set of documents in the session and re-query if the user
scrolls to the next page. This method is pretty fast and has a minimal
session- and processing-footprint. But, if your index is changed during
scr
I read somewhere recently (maybe even on this list) a
recommendation to requery each time for successive
pages as this avoids some of the complexity involved
in session management. What's people's view of this?
Marc
--- karl wettin <[EMAIL PROTECTED]> wrote:
>
> On 27 Apr 2006 at 20:44, Jean
This one's fairly wild, I'm interested to see what the gurus think...
You could create a bitset and mark each document retrieved by the
appropriate bit position (using the Lucene document id). Persist this bitset
(assuming you need it across sessions). Be careful, I wouldn't persist it
via the to
Hi,
I am also interested in this problem.
Regards
Jason
On 4/28/06, trupti mulajkar <[EMAIL PROTECTED]> wrote:
>
> hi
>
> I am trying to implement the vector space model for Lucene.
> I did find some code for generating the vectors, but can anyone suggest a
> better way of creating the IndexRead
hi
I am trying to implement the vector space model for Lucene.
I did find some code for generating the vectors, but can anyone suggest a better
way of creating the IndexReader object, as it is the only way that can return
the index created.
cheers,
trupti mulajkar
MSc Advanced Computer Science
-
Hi, I have an application that needs to mark the retrieved documents which have
been read, so that next time I needn't read the marked documents again.
My idea is to add a particular field to the indexed
document. But as Lucene has no update method, I have to delete that
document, and
What is the best way to prevent a phrase query such as "eggs white"
matching "fried eggs\nwhite snow"?
Two possibilities I have thought about:
1. Replace all line breaks with a special string, e.g. "newline".
2. Have an analyzer somehow increment the position of a term for each line
break it e
22 matches