Hi,
Norms are inputs to the score that are independent of the query. A norm
is typically computed as a function of the number of terms in a
document: the more terms, the higher the normalization factor and the
lower the score.
Lucene computes and indexes length normalization factors automatically
f
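To make the length-normalization idea above concrete, here is a minimal sketch in plain Java. It is not Lucene's code: it assumes the classic 1/sqrt(numTerms) formula used by TF-IDF style similarities, and the class and method names (LengthNormSketch, lengthNorm) are made up for illustration. The value returned is the multiplier applied to the score, so it shrinks as the field gets longer.

```java
// Illustrative only: not Lucene's code. Assumes the classic
// 1/sqrt(numTerms) length normalization; index-time boosts and the
// lossy compact encoding of the stored norm are ignored.
final class LengthNormSketch {

    // More terms in the field -> smaller multiplier -> lower score.
    static double lengthNorm(int numTerms) {
        if (numTerms <= 0) {
            throw new IllegalArgumentException("numTerms must be positive");
        }
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        // Print the multiplier for a few field lengths.
        for (int n : new int[] {1, 4, 16, 100}) {
            System.out.println(n + " terms -> norm " + lengthNorm(n));
        }
    }
}
```

A real index stores one such value per field per document at indexing time, which is why it cannot depend on the query.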
I just want to know what norms are in Lucene 4.10.4.
How do I implement norms in a program?
What are their types?
What is the difference between boost and norms?
Are there sample programs on norms?
can see how a document was scored internally in lucene
> > given a query.
> >
> > I see that the IndexSearcher has an explain
> > <
> https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/IndexSearcher.html#explain-org.apache.lucene.search.Que
ucene.apache.org/core/8_0_0/core/org/apache/lucene/search/IndexSearcher.html#explain-org.apache.lucene.search.Query-int->
> method
> available that returns an Explanation
> <https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/Explanation.html>
> object. An Expl
rch/IndexSearcher.html#explain-org.apache.lucene.search.Query-int->
method
available that returns an Explanation
<https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/Explanation.html>
object. An Explanation object only contains a description field (string)
but there is no way to know what par
Hi!
I think the explanation is:
my previous tests were on an index of a db table of 300 documents; small
index = 1 segment and doc at index level is equal to docID at
LeafReaderContext.
Anyway, it is not recommended to use doc as the id of a document; it is reserved
for Lucene's use.
Next step, my index
ONDON) <
skothar...@bloomberg.net> wrote:
> Hello,
>
> Currently the Explanation class has a toHtml and toString method, but it
> doesn't seem to have any method to output the explanation as a NamedList.
> Part of the reason is that NamedList is a Solr only concept and it can
Hello,
Currently the Explanation class has a toHtml and toString method, but it
doesn't seem to have any method to output the explanation as a NamedList. Part
of the reason is that NamedList is a Solr only concept and it cannot exist in
lucene. But this leads to utility functions like
(
flexible enough for you then you will need
> to implement a custom query.
>
>
> > 2. I'm researching Elasticsearch/Lucene capabilities. Elasticsearch
> > contains request parameter "explain" that uses Lucene's Explanation class
> > under the hood. But this
ichsearch
> contains request parameter "explain" that uses Lucene's Explanation class
> under the hood. But this class covers only scoring aspects. I would like to
> include matching logic details there. It seems a good place but this class
> is final..
>
Well, you can al
"explain" that uses Lucene's Explanation class
under the hood. But this class covers only scoring aspects. I would like to
include matching logic details there. It seems a good place but this class
is final..
Regards,
Vadim Gindin
Hello Lucene developers and users. I'm currently researching
Elasticsearch/Lucene capabilities.
I'd like to extend the information that the Explanation class provides. This
class currently provides only the score computation for a document or query.
Particularly I'd like to include the follow
Thank you.
I read and got to understand ByteBlockPool.
A chain of slices can sit on a byte block, with these slices linked by forward
addresses.
This suggests that multiple logical chains of slices can co-exist on the same
byte block, with their slices interleaving on the byte block.
In which
Why on earth would you start with this crazy Lucene class ;)
Alas I don't think there's any additional documentation for it, and it is
really hairy.
It's basically a simplistic filesystem, where each unique term we've seen
(for one in-memory segment) is a "file", and we don't know when we first
s
I am studying Lucene internals.
TermsHashPerField is the core class for inverting a field, and it is too complex
for me to understand.
Are there any detailed articles or materials that explain
TermsHashPerField's internals?
Best Wishes,
Yijian
lr - Nutch
From: "rolaren...@earthlink.net"
To: java-user@lucene.apache.org
Sent: Thursday, February 19, 2009 10:40:52 AM
Subject: Re: newbie seeking explanation of semantics of "Field" class
Thanks to Erick, Matthew, and Uwe -- that does help, a lot. E.g., one bit of
code I had (mostly copied) now makes more sense:
// add this field, to allow retrieving the full-text:
myDocument.add(new Field("contents", theFullDocumetText, Field.Store.COMPRESS,
Field.Index.NO));
// add this fiel
This confused me on my first encounter, but it all makes
sense after a while
The first thing to understand is that Store and Index are
orthogonal. That is, when you index a field, that data
is placed in the inverted index and is searchable, whether
or not you store it. But it is not retrievable
Hi Paul,
> I have copied some code and it is working for me, but I am a little
> uncertain how to decide what value of Field.Index and Field.Store to
> choose in order to get the behavior I'd like. If I read the javadocs, and
> decide to ignore all the "expert" items, it looks like this:
>
> Fiel
Comments inline:
rolaren...@earthlink.net wrote:
R2.4
I have been looking through the soon-to-be-superseded (by its 2nd ed.) book "Lucene In Action" (hope it's ok on this newsgroup to say I like that book); also at these two tutorials: http://darksleep.com/lucene/ and http://www.informit.com/ar
R2.4
I have been looking through the soon-to-be-superseded (by its 2nd ed.) book
"Lucene In Action" (hope it's ok on this newsgroup to say I like that book);
also at these two tutorials: http://darksleep.com/lucene/ and
http://www.informit.com/articles/article.aspx?p=461633&seqNum=3 and also a
: even mention that possibility. When I debug through the call, I find the
: "explanation" in this code inside class MarkupContainsQuery (which is
: the code that gets called):
...
: // TODO SY - implement
: >>>>>>
> R2.4
>
> There is much about Lucene that I do not understand, so it may be that there
> is some simple or obvious mistake I am making. I build an index, get hits
> (documents) back from it, with various non-zero scores. Now I call this code:
>
>Explanation expl = _
R2.4
There is much about Lucene that I do not understand, so it may be that there is
some simple or obvious mistake I am making. I build an index, get hits
(documents) back from it, with various non-zero scores. Now I call this code:
Explanation expl = _searcher.explain(rewrite
That worked perfectly.
Thanks a lot!
Sincerely,
Chris Salem
- Original Message -
To: java-user@lucene.apache.org
From: Erick Erickson
Sent: 12/22/2008 5:00:51 PM
Subject: Re: lucene explanation
Warning! I'm really reaching on this
But it seems you could use TermDocs/TermEn
query
> matched with the document. Right now I use the Explanation object, here's
> the code:
> int len = hits.length();
> if(len > 50) len = 50;
> for(int i=0; i Explanation ex = searcher.explain(Query.parse("resume_text:(query)"),
> hits.id(i));
> if(ex.
)
For each hit (up to 50) I'd like to find out which part of the query matched
with the document. Right now I use the Explanation object, here's the code:
int len = hits.length();
if(len > 50) len = 50;
for(int i=0; i
Sure; here are the two explanations (below). Your question made me go look
at the explanation more carefully again and (no) surprise, I discovered
that I misspoke (miswrote) earlier; the two "found" terms are j2ee and soa,
which then makes my "concern" much less of one,
Donna L Gresh skrev:
I have two slightly different queries,
Hi Donna,
I can't help you, but perhaps I would understand everything better if you
also pasted in the explanations.
karl
j2ee"^2.0, text:"soa"^2.0, text:webservic
In this case there are three boosted terms and one unboosted term. Note
that now both db2 and soa are "boosted".
The score is 0.065, which is slightly smaller, which is the opposite of
what I would expect, since I have two boosted te
ents of doc1 and doc2 are what
> you expect. You can even run queries through it (but watch to ensure
> that you're using the correct analyzer) and see what is returned
>
> Best
> Erick
>
> On Nov 27, 2007 3:54 PM, Ng Vinny <[EMAIL PROTECTED]> wrote:
>
> > Hi
ED]> wrote:
>
> > Hi all,
> >
> > I am having a problem with Lucene 2.2.0 with regard to the contents of
> the
> > Explanation objects after a PhraseQuery search. I indexed two documents
> doc1
> > and doc2 and then issue an OR Boolean query consisting of
eries through it (but watch to ensure
that you're using the correct analyzer) and see what is returned
Best
Erick
On Nov 27, 2007 3:54 PM, Ng Vinny <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I am having a problem with Lucene 2.2.0 with regard to the contents of the
>
Hi all,
I am having a problem with Lucene 2.2.0 with regard to the contents of the
Explanation objects after a PhraseQuery search. I indexed two documents doc1
and doc2 and then issue an OR Boolean query consisting of two PhraseQuery
pq1 and pq2.
Apparently, the details of the Explanation object
Oh, duh! Of course it is. I've done that before.
Thanks Daniel.
John G.
-Original Message-
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: Friday, November 23, 2007 5:52 PM
To: java-user@lucene.apache.org
Subject: Re: Explanation
On Samstag, 24. November 2007, John Griffin
On Samstag, 24. November 2007, John Griffin wrote:
> System.out.println(indexSearcher.explain(query,
> counter).toString());
I think you need to use hits.id() instead of counter.
Regards
Daniel
--
http://www.danielnaber.de
-
Is there a problem with the term frequency count (tf) and the
IndexSearcher.explain method? I'm searching the following string (fieldname
is description) for the term 'salesman' and receive the accompanying
explanation from IndexSearcher.explain(.). I've highlighted
And it was as easy as all that...
Thanks.
- Original Message
From: Chris Hostetter <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, April 6, 2007 12:23:30 PM
Subject: Re: Explanation from FunctionQuery
: So we reach a problem at extractTerms. I get an explanat
: So we reach a problem at extractTerms. I get an explanation no problem
...
: I'm using the version of FunctionQuery from the JIRA attachment.
that seems like the heart of the problem ... i haven't looked at the
version in Jira for a while, but the version committed into
ublic Explanation explain(Query query, int doc) throws IOException {
return explain(createWeight(query), doc);
}
I've verified that I am in fact calling the explain method that I think I am.
Here's the full stack trace:
java.lang.UnsupportedOperationE
actly what the problem is...
: The ms is a MultiSearcher. I read that
...this is the implementation for MultiSearcher...
public Explanation explain(Weight weight,int doc) {
throw new UnsupportedOperationException();
}
...it's got nothing to do with FunctionQuery
I'm hoping someone can offer some insight into the FunctionQuery. I've just
discovered this, and I think it's exactly what I've been looking for, but I'm
having some trouble getting it to work. I can create and execute the query, but
if I try to see
son [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 07, 2007 8:19 PM
To: java-user@lucene.apache.org
Subject: Re: Counting term frequency without using Explanation
Before you go too far down this path, please consider what a "hit" is.
It's more complicated than you think.
If a
to the threads you have mentioned?
Thanks,
Harini
-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 07, 2007 8:19 PM
To: java-user@lucene.apache.org
Subject: Re: Counting term frequency without using Explanation
Before you go too far down this
t
> as well, like this
>
> doc1  0.8333  websphere=3, Java=2
> doc2  0.817   websphere=2, Java=2
>
>
> I already tried to implement with TermFreqVector, but TermFreqVector shows
> all the terms in the field, instead what I want is only the t
per document as well, like this
doc1  0.8333  websphere=3, Java=2
doc2  0.817   websphere=2, Java=2
I already tried to implement this with TermFreqVector, but TermFreqVector shows
all the terms in the field, whereas what I want is only the terms that occur in
the query.
I
ppen in the
query.
I already tried using TermDocs as well, but it always gave a result of 0.
I tried using the Explanation class and its toString method, but then I have to
"clean" the information.
Is there any "direct" way to do this in Lucene? Or perhaps someone can
give me a hint?
Thanks in advance
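As a rough illustration of the output asked for above (per-document counts for only the terms that appear in the query), here is a sketch in plain Java. It does not use any Lucene API: the class name QueryTermCountSketch, the method countQueryTerms, and the sample text are all hypothetical, and a naive lowercase split stands in for a real analyzer.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: plain Java, no Lucene API. A naive lowercase split
// on non-alphanumerics stands in for a real analyzer.
final class QueryTermCountSketch {

    // Counts how often each query term occurs in one document's field text.
    static Map<String, Integer> countQueryTerms(String fieldText, List<String> queryTerms) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : queryTerms) {
            counts.put(term.toLowerCase(Locale.ROOT), 0);
        }
        for (String token : fieldText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            Integer seen = counts.get(token);
            if (seen != null) {
                // Only terms from the query are tracked; all others are skipped.
                counts.put(token, seen + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "WebSphere runs Java; websphere and java on WebSphere";
        System.out.println(countQueryTerms(text, Arrays.asList("websphere", "java")));
    }
}
```

In a real application the tokens would have to come from the same analyzer that was used at index time, otherwise the counts will not line up with what the index contains.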
: on using Lucene but info for the internal workings of Lucene is hard to
: come by.
As with many OS code bases: the code is the documentation.
: 1) I'm using the default QueryParser to parse and return a query so it's
: a Boolean-OR query. So does this mean it uses the DisjunctionSumScorer
: or
Thanks, Chris, for your clear explanations. It seems there is a lot of info
on using Lucene, but info on the internal workings of Lucene is hard to
come by.
I have some more questions, which I'll ask in-line.
Chris Hostetter wrote:
: Since i'm using a boolean OR query i figured it must be related
: Since i'm using a boolean OR query i figured it must be related to the
: BooleanScorer (though there's a more complicated BooleanScorer2 which
: I'm not sure when it's used).
There are actually three possible scorers used: ConjunctionScorer can be
used if all of the clauses are required. Most of
hod i see that it takes in a float for
: freq instead of int. So i'm curious to see how this method is invoked.
I commented on this recently (and no one contested my explanation)...
http://www.nabble.com/Similarity-Usage%3A-tf%28int%29-vs-tf%28float%29-p2981283.html
-Hoss
---
what I do)
: For example, looking at the tf method i see that it takes in a float for
: freq instead of int. So i'm curious to see how this method is invoked.
I commented on this recently (and no one contested my explanation)...
http://www.nabble.com/Similarity-Usage%3A-tf%28int%29-vs-tf%
Thanks for posting the "more like this" code. I just began coding my
cosine similarity and need some help. Can anyone tell me in which file
the DefaultSimilarity methods are called?
For example, looking at the tf method I see that it takes in a float for
freq instead of an int.
Eugene wrote:
Any good links on extending the Similarity class? A lot of posts
discuss David Spencer's "More Like This" but I can't find this anywhere.
The "More Like This" code can be found here:
http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/contrib/similarity/
--
I was wondering if anyone has any idea how I can start to implement my
own similarity. I want to use the cosine similarity measure instead. I was
looking through past forum posts and saw that quite a few people
have also discussed this, but no real method of doing it was mentioned.
Any good
: I was looking at the new 1.9 api and can't seem to find this expert mode
: of searching.
Yonik's referring to all of the methods in the Searcher class that have
"Expert" in their (javadoc) description.
:
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/IndexSearcher.html#search(
I was looking at the new 1.9 api and can't seem to find this expert mode
of searching.
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/IndexSearcher.html#search(org.apache.lucene.search.Weight,%20org.apache.lucene.search.Filter,%20org.apache.lucene.search.HitCollector)
Can you te
On 3/3/06, Eugene <[EMAIL PROTECTED]> wrote:
> Just one more question: Any way in which i can disable this normalization?
We disabled this normalization in Lucene 1.9 for the "expert"
level search methods on IndexSearcher. Use the search methods that
don't return Hits.
-Yonik
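For readers following along, here is a small sketch of the kind of normalization being discussed, in plain Java. It assumes the normalization in question is the Hits-style rescaling, where all scores are divided by the top score whenever that top score exceeds 1.0. The class and method names (HitsNormSketch, normalize) are hypothetical and this is not Lucene's code.

```java
import java.util.Arrays;

// Illustrative only: assumes the normalization in question is the
// Hits-style rescaling (divide by the top score when it exceeds 1.0).
final class HitsNormSketch {

    // rawScores must be sorted best-first, as search results are.
    static float[] normalize(float[] rawScores) {
        float scoreNorm = 1.0f;
        if (rawScores.length > 0 && rawScores[0] > 1.0f) {
            scoreNorm = 1.0f / rawScores[0];
        }
        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] * scoreNorm;
        }
        return out;
    }

    public static void main(String[] args) {
        // Top score above 1.0: every score is rescaled.
        System.out.println(Arrays.toString(normalize(new float[] {2.0f, 1.0f, 0.5f})));
        // Top score at or below 1.0: scores are left as they are.
        System.out.println(Arrays.toString(normalize(new float[] {0.8f, 0.4f})));
    }
}
```

Under that assumption, normalized scores from two different queries are not comparable with each other, which is one reason to prefer the raw scores from the search methods that don't return Hits.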
--
ote:
On 3/3/06, Eugene <[EMAIL PROTECTED]> wrote:
Hi Yonik,
Thanks a lot, I think i understand how explanation works better now.
But, there's something weird I noticed. I've a query like:
"problem formulation each possible x probability p x y find x p x y
maximized how com
On 3/3/06, Eugene <[EMAIL PROTECTED]> wrote:
> Hi Yonik,
>
> Thanks a lot, I think i understand how explanation works better now.
>
> But, there's something weird I noticed. I've a query like:
> "problem formulation each possible x probability p x y fin
Hi Yonik,
Thanks a lot, I think i understand how explanation works better now.
But, there's something weird I noticed. I've a query like:
"problem formulation each possible x probability p x y find x p x y
maximized how compute p x y"
The weird thing is that li
On 3/2/06, Eugene Ezekiel <[EMAIL PROTECTED]> wrote:
> Thanks Yonik for the reply. I got just a couple more questions,
>
> 1) Why does the explanation print so many times?
Because it was a compound query with multiple parts to it. It's one explanation
with multiple parts
> wrote:
> > Hi All,
> >
> > I'm not sure how to interpret the result of the toString method of
> > Explanation. I'm trying to see the values of each component of the
> > Default Similarity formula for a particular query and a doc. Given
> > below is a samp
>
> I'm not sure how to interpret the result of the toString method of
> Explanation. I'm trying to see the values of each component of the
> Default Similarity formula for a particular query and a doc. Given
> below is a sample of my Explanation output. Many thanks if anyon
Hi All,
I'm not sure how to interpret the result of the toString method of
Explanation. I'm trying to see the values of each component of the
Default Similarity formula for a particular query and a doc. Given
below is a sample of my Explanation output. Many thanks if anyone c