Raghavendra Prabhu wrote:
While indexing, I use a different Analyzer.
While searching, I use a simple standard Analyzer.
Will this prevent me from getting the same best fragments when I do a search
for two terms, say term1 and term2?
It depends on the differences, but in general you will always g
While indexing, I use a different Analyzer.
While searching, I use a simple standard Analyzer.
Will this prevent me from getting the same best fragments when I do a search
for two terms, say term1 and term2?
Rgds
Prabhu
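A minimal sketch of why the analyzer choice matters here, assuming that fragment selection depends on query terms matching the tokens produced when the text is analyzed. The two "analyzers" below are hypothetical toy functions written for illustration, not Lucene classes: one only lowercases and splits, the other additionally strips a trailing "s".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy stand-ins for two analyzers; these are NOT Lucene classes.
class AnalyzerMismatchSketch {

    // Hypothetical "simple" analyzer: lowercase, then split on anything that is not a-z.
    static List<String> simpleAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Hypothetical "different" analyzer: as above, plus a naive plural strip.
    static List<String> strippingAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : simpleAnalyze(text)) {
            boolean plural = t.length() > 3 && t.endsWith("s");
            tokens.add(plural ? t.substring(0, t.length() - 1) : t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> indexed = strippingAnalyze("Deadlocks in threads");
        List<String> query = simpleAnalyze("deadlocks");
        System.out.println("indexed terms: " + indexed);
        System.out.println("query terms:   " + query);
        // The query term was never produced at index time, so it cannot match.
        System.out.println("query term found: " + indexed.contains(query.get(0)));
    }
}
```

When both sides produce the same tokens for the terms being searched, the mismatch does not arise, which is one reading of "it depends on the differences".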
1) An inverted full text index is not a replacement for a relational
database.
2) Many people think they need a relational database, when all they really
need is a well designed full text index.
To get to some of your specific questions...
: them in one field). One of the problems I see would b
Hi,
We have made documents out of the rows in our database and one of the team
is suggesting that we abandon some of our database queries and instead use
lucene. I think there are some fundamental problems with this especially
when it comes to association tables (where there is a one-to-many
rela
Peter Keegan wrote:
Oops. I meant to say: Does this mean that an IndexSearcher constructed from
a MultiReader doesn't merge the search results and sort the results as if
there was only one index?
It doesn't have to, since a MultiReader *is* a single index.
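One way to picture "a MultiReader *is* a single index" is as a single document-number space laid over the sub-indexes. The sketch below is a simplified model under that assumption, not Lucene's actual implementation; the class and method names are hypothetical.

```java
// Simplified model of a composite reader; NOT Lucene's actual implementation.
class CompositeDocIdSketch {
    // starts[i] is the first global document number of sub-index i;
    // the last entry is the total number of documents.
    private final int[] starts;

    CompositeDocIdSketch(int[] maxDocs) {
        starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    int maxDoc() {
        return starts[starts.length - 1];
    }

    // Which sub-index holds the given global document number.
    int subIndexOf(int globalDoc) {
        if (globalDoc < 0 || globalDoc >= maxDoc()) {
            throw new IllegalArgumentException("doc out of range: " + globalDoc);
        }
        int i = 0;
        while (globalDoc >= starts[i + 1]) {
            i++;
        }
        return i;
    }

    // Document number relative to the sub-index that holds it.
    int localDoc(int globalDoc) {
        return globalDoc - starts[subIndexOf(globalDoc)];
    }

    public static void main(String[] args) {
        // Two sub-indexes with 3 and 5 documents look like one index of 8.
        CompositeDocIdSketch composite = new CompositeDocIdSketch(new int[] {3, 5});
        System.out.println("maxDoc = " + composite.maxDoc());
        System.out.println("global doc 4 -> sub-index " + composite.subIndexOf(4)
                + ", local doc " + composite.localDoc(4));
    }
}
```

Because a searcher on top of such a reader sees one contiguous range of documents, it runs one search and has no per-index result lists to merge.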
A quick test indicates that it does
On 4/11/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> Oops. I meant to say: Does this mean that an IndexSearcher constructed from
> a MultiReader doesn't merge the search results and sort the results as if
> there was only one index?
That's how I answered it.
A single search is done... the "mergin
Oops. I meant to say: Does this mean that an IndexSearcher constructed from
a MultiReader doesn't merge the search results and sort the results as if
there was only one index?
A quick test indicates that it does merge the results properly, however
there is a difference in the order of documents wi
On 4/11/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> Does this mean that MultiReader doesn't merge the search results and sort
> the results as if there was only one index?
Correct, it doesn't. It supports the lower level primitives like
TermEnum and TermDocs that searches use to run. A term qu
Does this mean that MultiReader doesn't merge the search results and sort
the results as if there was only one index? If not, does it simply
concatenate the results?
Peter
On 4/11/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> On 4/11/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> > Could you
In case anyone misses the smiley, I'm just teasing. Yes, there are
years of research and heavy duty experience that are behind Lucene.
There are quite a number of research documents and books that
describe information retrieval theory and practice, several of them
linked here:
<
On Apr 11, 2006, at 1:46 PM, miki sun wrote:
Is there any theory behind the similarity measure of Lucene?
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/
Similarity.html
No, Doug just made it up with some random mathematical formulas, just
for fun :)
Erik
--
If you use a custom Similarity class, the tf(float) function is used for
phrases to determine how the score depends on the number of times the
phrase appears in the document.
If you make it an identity function, and modify the other functions in the
Similarity to be (mostly) c
Hi there
Is there any theory behind the similarity measure of Lucene?
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
Thanks
Miki
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional co
It uses a combination of boolean, to get the set of matching
documents, and vector space (by default) to rank them. Or one might
say it uses the vector space model, and only returns nonzero scoring
documents.
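The two-step description above (match, then rank) can be sketched in plain Java. This is an illustration only: the weighting is a simplified TF-IDF chosen for the example, not Lucene's exact scoring formula, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: simplified TF-IDF ranking, NOT Lucene's exact formula.
class BooleanThenRankSketch {

    // Returns ids of documents containing at least one query term,
    // ordered by descending score. Zero-scoring documents are dropped.
    static List<Integer> search(List<List<String>> docs, List<String> queryTerms) {
        int n = docs.size();
        double[] scores = new double[n];
        for (String term : queryTerms) {
            int docFreq = 0;
            for (List<String> doc : docs) {
                if (doc.contains(term)) {
                    docFreq++;
                }
            }
            // Rare terms weigh more than common ones.
            double idf = 1.0 + Math.log((double) n / (docFreq + 1));
            for (int i = 0; i < n; i++) {
                int tf = Collections.frequency(docs.get(i), term);
                scores[i] += Math.sqrt(tf) * idf;
            }
        }
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (scores[i] > 0) {
                hits.add(i); // the "boolean" part: only matching documents survive
            }
        }
        hits.sort((a, b) -> Double.compare(scores[b], scores[a]));
        return hits;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                List.of("lucene", "index", "search"),
                List.of("database", "index"),
                List.of("lucene", "lucene", "search"),
                List.of("cooking"));
        System.out.println(search(docs, List.of("lucene", "search")));
    }
}
```

Documents 1 and 3 contain neither query term, score zero, and are never returned, which is the sense in which the vector space view and the boolean view describe the same result set.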
On 4/10/06, hu andy <[EMAIL PROTECTED]> wrote:
> I have seen in some documents that ther
Hi,
I am using phraseQuery to get the number of documents that the query
appears in, using the hits. I would like to know if there is any way in
which I can get the number of times a phrase appears within each
document.
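To make the quantity being asked for concrete, here is a small sketch that counts exact, consecutive occurrences of a phrase in one document's token list. It is plain Java over already-tokenized text and does not use the Lucene API; the class and method names are hypothetical.

```java
import java.util.List;

// Plain-Java illustration of a per-document phrase count; does not use Lucene.
class PhraseCountSketch {

    // Counts the positions at which the phrase tokens occur consecutively.
    static int countPhrase(List<String> docTokens, List<String> phrase) {
        if (phrase.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i + phrase.size() <= docTokens.size(); i++) {
            if (docTokens.subList(i, i + phrase.size()).equals(phrase)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> doc = List.of("this", "design", "avoids", "deadlock",
                "and", "also", "avoids", "deadlock", "later");
        System.out.println(countPhrase(doc, List.of("avoids", "deadlock")));
    }
}
```

The hit count only says how many documents match; a per-document figure like this one has to come from the positions of the terms inside each document.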
I am currently searching for the phrase "avoids deadlock"
phraseQuery q
On 4/11/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> Could you explain why an IndexSearcher constructed from multiple readers is
> faster than a MultiSearcher constructed from same readers?
The "convergence layer" is a level lower for a MultiReader vs a MultiSearcher.
A MultiReader is an IndexRe
I guess Compass is probably the way to go - http://www.opensymphony.com/compass/
From: Prasenjit Mukherjee [mailto:[EMAIL PROTECTED]
Sent: Tue 4/11/2006 2:45 AM
To: java-user@lucene.apache.org
Subject: Re: Distributed Lucene.. - clustering as a requirement
Agre
Yonik,
Could you explain why an IndexSearcher constructed from multiple readers is
faster than a MultiSearcher constructed from same readers?
Thanks,
Peter
On 4/10/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> On 4/10/06, oramas martín <[EMAIL PROTECTED]> wrote:
> > Is there any performance
What is the best way to cluster searching?
On Tuesday 11 April 2006 10:33, Nadav Har'El wrote:
> This sort of proximity-influenced scoring is missing from
> Lucene's QueryParser, and I've been wondering recently
> on how it is best to add it, and whether it is possible to
> easily do it with existing Lucene machinery, like the
> SpanQuery
"Maxym Mykhalchuk" <[EMAIL PROTECTED]> wrote on 11/04/2006 11:52:07 AM:
> As for improving multi-word queries, Doug Cutting recently posted a link to
> his presentation,
> http://www.haifa.ibm.com/Workshops/ir2005/papers/DougCutting-Haifa05.pdf,
> just scroll down to Nutch N-Grams there, and you'l
Hi Nadav,
Thanks for the suggestions.
As for improving multi-word queries, Doug Cutting recently posted a link to
his presentation,
http://www.haifa.ibm.com/Workshops/ir2005/papers/DougCutting-Haifa05.pdf,
just scroll down to Nutch N-Grams there, and you'll see the answer.
Basically, "Buffy the V
"Maxym Mykhalchuk" <[EMAIL PROTECTED]> wrote on 10/04/2006 09:46:16 PM:
> Here's the issue: All my "documents" will be having a few (2-3:
> title, short description) short fields. You see, it's rare that the
> same word is repeated several times in a title, so will Lucene be
> able to give me a dec