the "explain" method on a Searcher, and the Explanation classes can
explain everything about how/why a particular document in a particular
index gets a particular score for a particular search. The only tricky
thing about it is understanding that it refers to the "raw" scores (what
you seem to be
: The mailing list has already answered this question dozens of times. I've
: been wondering lately, does this list have a FAQ? If so, is this question on
: it?
The wiki is open to editing by all.
Here are a couple choice threads related to this topic which should be of
interest both to the th
: My question is: If I just want to update the small fields in one index
: and do not want to update the large fields in another index, how can I
: make sure these two indexes are synchronized and have the same document
: number?
the short answer: build them in the same order, use the exact same
Compass uses a trick to manage parent-child indexing.
What if you index each "collection" as a document, with a Date field
holding the date of its newest picture, and add all of its pictures'
keywords to the collection?
Then, with a keyword search, you will find the collection with the
most tag occurrences and date s
On 15 Jun 2007, at 19:07, Walt Stoneburner wrote:
Antoine Baudoux writes:
I want to be able to give a score to each collection.
Keep in mind, Lucene is computing a score based on quite a number of
things from how often a term is used in a document, how often it
appears in the collection of doc
Well, maybe I didn't explain my problem very well. I have a database
with over 3 million images, with each image belonging to one of
300 possible collections. A query could return more than 100,000
images (for example, if they search for a popular image keyword).
I want to sort my result
Walt explains differently what I said.
Lucene can be used efficiently for selecting objects, without sorting
or scoring anything; then, with the IDs stored in Lucene, you can sort
them yourself with a simple sortable (Comparable) implementation.
The only limit is that Lucene must not give you too many results, with
your
On Friday 15 June 2007 03:07, Antony Sequeira wrote:
> Hi
> I am aware that with Lucene I can not do negative only queries such as
> -foo:bar
>
> But today I ran into an issue where I realized even queries such as
> +foo:bar +(-goobly:doo)
> also never return any results.
Could you try this:
Your examples are a little confusing to read. However, I think one thing
that you need to know is that the score (by "default") depends on more
than just the number of hits. It also depends on the length of the
document the hits are in. For example, matching two words in a
two-word-long documen
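
To make the length effect concrete, here is a minimal sketch assuming the classic DefaultSimilarity formulas of that era, where the term-frequency factor is sqrt(freq) and the length norm is 1/sqrt(numTerms). The class and method names below are made up for illustration, and the real score also includes idf, coord, query normalization and boosts, which are left out here.

```java
public class LengthNormSketch {
    // Length normalization as in classic DefaultSimilarity:
    // the fewer terms a field has, the larger the norm.
    public static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Term-frequency factor as in classic DefaultSimilarity.
    public static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Hypothetical helper: only the tf and length parts of a score.
    // idf, coord, queryNorm and boosts are deliberately omitted.
    public static double partialScore(int freq, int numTerms) {
        return tf(freq) * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // Same single match, very different document lengths.
        System.out.println("2-term doc:   " + partialScore(1, 2));
        System.out.println("200-term doc: " + partialScore(1, 200));
    }
}
```

With everything else equal, the single match in the short document gets the larger partial score, which is why hit counts alone do not predict the ranking.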
Antoine Baudoux writes:
I want to be able to give a score to each collection.
Keep in mind, Lucene is computing a score based on quite a number of
things from how often a term is used in a document, how often it
appears in the collection of documents, how long the query is, etc.
If your concep
Your need is:
From a request you find images
from images you get collections
collections are sorted
collections are returned
you've got a lot of images, and 300 collections
right?
Antoine Baudoux wrote:
> I am very sorry, but I don't understand at all what you mean in
> terms of Lucene
I am very sorry, but I don't understand at all what you mean in terms
of the Lucene API. Could you drop a few lines of concrete code to help me
understand? I'm quite new to Lucene.
Thanks!
You only sort the collections, of which there are 300.
First step: you run the search query with Lucene.
Map collecs wich com
Hi,
Another possibility is to re-think this a bit. You are "displaying
documents one page at a time", which I take to mean you
are displaying some number (say 50) document summaries
per page.
I'm also assuming that you want to display ALL documents
from, say, collection 32 and then (and only th
Thanks Antony for the idea.
The only thing that may prevent it from working well is that the index
is updated frequently, so the docid-to-external-id mapping or cache needs
to be updated frequently, which may affect performance.
Thanks again for your help.
Antony Bowesman wrote:
yu wrote:
Thanks Sawan for
Another possibility is to re-think this a bit. You are "displaying
documents one page at a time", which I take to mean you
are displaying some number (say 50) document summaries
per page.
I'm also assuming that you want to display ALL documents
from, say, collection 32 and then (and only then) di
You only sort the collections, of which there are 300.
First step: you run the search query with Lucene.
Map collecs, which comes from any persisted storage.
Collection implements Comparable.
Set bags = new HashSet();
iterate over the hits:
bags.add(collecs.get(hit.getTheIdOfTheCollection()));
you've got a bag with at most 300 elemen
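
The two steps above could be sketched roughly as follows, using only the JDK. This is an illustration under stated assumptions, not Lucene API: the Hit class is a hypothetical stand-in for whatever you read from each search hit, and ordering collections by the date of their newest matching image is just one assumed criterion.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CollectionRanking {
    /** Hypothetical stand-in for one search hit: collection ID plus image date. */
    public static final class Hit {
        final int collectionId;
        final long date;

        public Hit(int collectionId, long date) {
            this.collectionId = collectionId;
            this.date = date;
        }
    }

    /**
     * Step 1: reduce the hits to one entry per collection (at most 300 here).
     * Step 2: sort those few collections, newest matching image first
     * (an assumed criterion); ties are broken by ascending collection ID.
     */
    public static List<Integer> rankCollections(List<Hit> hits) {
        Map<Integer, Long> newestDate = new HashMap<>();
        for (Hit h : hits) {
            // Keep the largest date seen so far for this collection.
            newestDate.merge(h.collectionId, h.date, Math::max);
        }
        List<Integer> ids = new ArrayList<>(newestDate.keySet());
        ids.sort((a, b) -> {
            int byDate = Long.compare(newestDate.get(b), newestDate.get(a));
            return byDate != 0 ? byDate : Integer.compare(a, b);
        });
        return ids;
    }
}
```

However many images match, the sort itself only ever touches one entry per collection; the cost is in iterating over the hits once.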
The problem is that I want Lucene to do the sorting, because the
query would return thousands of results, and I'm displaying documents
one page at a time.
--
Antoine Baudoux
Development Manager
[EMAIL PROTECTED]
Tél.: +32 2 333 58 44
GSM: +32 499 534 538
Fax.: +32 2 648 16 53
On 15 Jun 2007,
First step is to feed a Set with "collection"
Second step is to sort it.
With a SortedSet, you can do that, can't you?
M.
Antoine Baudoux wrote:
> Could you be more precise? I don't understand what you mean.
>
>
>
> On 15 Jun 2007, at 17:20, Mathieu Lecarme wrote:
>
>> Your request seems to be
Could you be more precise? I don't understand what you mean.
On 15 Jun 2007, at 17:20, Mathieu Lecarme wrote:
Your request seems to be a two-step query.
First step: you select images, and then collections.
Second step: you sort the collections.
Could BitVector help you?
M.
Antoine Baudoux wrote:
Your request seems to be a two-step query.
First step: you select images, and then collections.
Second step: you sort the collections.
Could BitVector help you?
M.
Antoine Baudoux wrote:
> Hi,
>
> I'm developing an image database. Each Lucene document
> representing an image contains (among ot
From my perspective, this is an irrelevant question. The real question
is "is Lucene indexing fast enough for my application?". Which nobody
can answer for you, you have to experiment.
If you're building an index that's only updated every 6 months,
Lucene is certainly "fast enough". If you're re
On 6/14/07, Renaud Waldura <[EMAIL PROTECTED]> wrote:
Thank you for this crystal-clear explanation Mark!
> Are you sure you need a PhraseQuery and not a Boolean
> query of Should clauses?
Excellent question. What's the requirement, hey? Well, the requirement is
to find documents referring to "
Hi,
I'm developing an image database. Each Lucene document representing
an image contains (among other fields ):
- a date field
- a collection field containing the ID of the collection the image
belongs to.
I want to be able to give a score to each collection. Collecti
Hi Antony,
Antony Sequeira wrote:
> In the attached test file I am using string queries and showing the
> failure case.
The attachment didn't make it for some reason.
> Basically I get the impression that I can not have a clause like
> +(-x:y) anywhere in my query.
What follows assumes that the
Daniel Noll wrote:
> On Friday 15 June 2007 11:07:25 Antony Sequeira wrote:
>> Hi
>> I am aware that with Lucene I can not do negative only queries such as
>> -foo:bar
>
> The mailing list has already answered this question dozens of times. I've
> been wondering lately, does this list have a F
Begin forwarded message:
From: J Aaron Farr <[EMAIL PROTECTED]>
Call for Papers Opens for OS Summit Asia 2007
The call for papers is now open for OS Summit Asia, to be held
November 26-30 at the Cyberport in Hong Kong. This joint conference
between the Apache Software Foundation and the Ecl
yu wrote:
Thanks Sawan for the suggestion.
I guess this will work for statically known doc ids. In my case, I know
only external ids that I want to exclude from the result set for each
search. Of course, I can always exclude these docs in a post search
process. I am curious if there are oth
It's better to first understand the computational difference
between Lucene indexing and database insertion.
Lucene indexing needs to stem all the words, sort them, and save them
to disk. And since Lucene uses an incremental merge model, saved
documents may need to be merged and saved again. There
Thanks Sawan for the suggestion.
I guess this will work for statically known doc ids. In my case, I know
only external ids that I want to exclude from the result set for each
search. Of course, I can always exclude these docs in a post search
process. I am curious if there are other more eff
Hi, I’m a new user of Lucene. I heard that it is a powerful tool for full
text search, and I’m planning to use it in my project for data storage
purposes. Before the implementation, I would like to know whether there are
performance issues with the Lucene indexing process. I have no doubt about the
retrievin
30 matches