best if we
> can
> support other models also!
>
> Finally I think there is something to be said for Lucene's default
> retrieval
> model, which in my (non-English) findings across the board isn't terrible
> at
> all... then again I am working with languages
> Note: I have no bias against BM25, but it's definitely a myth to say there
> is a single retrieval formula that is the 'best' across the board.
>
>
> On Tue, Feb 16, 2010 at 1:53 PM, JOAQUIN PEREZ IGLESIAS <
> joaquin.pe...@lsi.uned.es> wrote:
>
>> By the w
>>
>>
>> --- On Tue, 2/16/10, Robert Muir wrote:
>>
>>> From: Robert Muir
>>> Subject: Re: BM25 Scoring Patch
>>> To: java-user@lucene.apache.org
>>> Date: Tuesday, February 16, 2010, 11:36 AM
>>> yes Ivan, if possible p
>> Date: Tuesday, February 16, 2010, 11:36 AM
>> yes Ivan, if possible please report
>> back any findings you can on the
>> experiments you are doing!
>>
>> On Tue, Feb 16, 2010 at 11:22 AM, Joaquin Perez Iglesias
>> <
>> joaquin.pe...@lsi.uned.es>
Hi Ivan,
You shouldn't set the BM25Similarity for indexing or searching.
Please try removing the lines:
writer.setSimilarity(new BM25Similarity());
searcher.setSimilarity(sim);
Please let us/me know if you improve your results with these changes.
Robert Muir wrote:
Hi Ivan, I've seen
-BM25/
Best Regards.
José Ramón Perez Aguera wrote:
Hi Grant,
Our query expansion approach is quite simple: we apply
pseudo-relevance feedback techniques, where a number of top retrieved
documents are used to extract candidate terms to expand the
original query. We have used
necessary for query expansion. On the other
hand, to implement BM25, we have used the implementation proposed by
Joaquin Perez, where the average length is computed at indexing time and
is used as a constant at query time.
We know that this is not the best way to do that, but we don't
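To make the "average length as a constant" idea above concrete, here is a minimal sketch of the textbook Okapi BM25 term score in plain Java. It is not the code from the patch discussed in this thread: the class name Bm25Sketch, the parameter values k1 = 1.2 and b = 0.75, and the idf variant are all assumptions chosen for illustration.

```java
// Hypothetical illustration only; not part of Lucene or of the BM25 patch.
final class Bm25Sketch {
    static final double K1 = 1.2;  // assumed term-frequency saturation parameter
    static final double B = 0.75;  // assumed length-normalization parameter

    // Average document length, computed once (e.g. at indexing time)
    // and then treated as a constant while scoring queries.
    static double averageLength(int[] docLengths) {
        long total = 0;
        for (int len : docLengths) {
            total += len;
        }
        return (double) total / docLengths.length;
    }

    // One common idf variant; the "1 +" keeps the value non-negative.
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // BM25 contribution of a single query term to a single document.
    static double termScore(int tf, int docLength, double avgLength,
                            long docCount, long docFreq) {
        double norm = K1 * (1.0 - B + B * docLength / avgLength);
        return idf(docCount, docFreq) * (tf * (K1 + 1.0)) / (tf + norm);
    }

    public static void main(String[] args) {
        double avg = averageLength(new int[] {10, 20, 30});
        // Same term frequency, different document lengths.
        System.out.println(termScore(2, 10, avg, 3, 1));
        System.out.println(termScore(2, 30, avg, 3, 1));
    }
}
```

With the same term frequency, the shorter document receives the higher score, which is the length normalization that depends on the precomputed average. If documents are added after that average was computed, the constant goes stale, which may be one reason the approach is described as not ideal.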
Is there an analyzer that can work with XML? Any suggestions for such?
-arturo
Hi all,
Which type of query should I use for the following type of thing.
I have multiple words/phrases. I want to run a search for them all OR'd
together. But I want the documents with the most distinct matches to have
the highest score.
An example. I want to search for "TOM OR DICK OR HARRY
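The ranking asked for above, where documents matching more distinct terms come first, can be sketched in plain Java as follows. This is only an illustration of the desired ordering, not Lucene code and not a statement about how Lucene scores an OR query; the class name DistinctMatchSketch is hypothetical, and the sketch assumes single-word, lower-cased query terms (phrases are not handled).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene.
final class DistinctMatchSketch {
    // Counts how many distinct query terms occur in the text.
    // Repeated occurrences of the same term are counted once.
    static int distinctMatches(String text, Set<String> queryTerms) {
        Set<String> seen = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (queryTerms.contains(token)) {
                seen.add(token);
            }
        }
        return seen.size();
    }

    // Returns a copy of docs ordered by descending number of distinct matches.
    static List<String> rank(List<String> docs, Set<String> queryTerms) {
        List<String> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator
                .comparingInt((String d) -> distinctMatches(d, queryTerms))
                .reversed());
        return sorted;
    }
}
```

Under this ordering a document mentioning Tom, Dick and Harry once each ranks above a document that mentions Tom many times, because only distinct terms are counted.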
Daniel Naber danielnaber.de> writes:
> On Monday 15 May 2006 14:54, Franz Coriand wrote:
> > is it possible not only to get the document which contains the words of
> > a query, but also get the position in the text of the query word?
>
> Yes, by using the term vectors with positions that were ad
yzer is eating numbers?
tia,
arturo
>
> On Apr 7, 2006, at 10:45 PM, Perez wrote:
>
> > Hi all,
> >
> > I have a document with a date in it and I put it into a field like so:
> > DateTools.dateToString(theDate, Resolution.DAY),
> > Field.Index.UN_TOKEN
Hi all,
I have a document with a date in it and I put it into a field like so:
DateTools.dateToString(theDate, Resolution.DAY),
Field.Index.UN_TOKENIZED.
What I find is that a range query works:
[20060131 TO 20060601] and wildcard works e.g.
2006*
but exact matches do not work e.g.
20060130
Any
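For reference, here is a small plain-Java sketch of the string that ends up in such a field. It assumes, consistent with the values quoted in the message (20060131, 20060601), that day resolution yields a yyyyMMdd string in GMT; the class DayResolutionSketch is a hypothetical stand-in and does not call Lucene.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical stand-in for day-resolution date formatting; not Lucene code.
final class DayResolutionSketch {
    // Formats a date as yyyyMMdd in GMT (assumed to match the indexed value).
    static String toDayString(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    public static void main(String[] args) {
        // 1138579200000L is 30 January 2006, 00:00:00 GMT.
        System.out.println(toDayString(new Date(1138579200000L)));
    }
}
```

Because the field is untokenized, an exact match only succeeds if the query term is this same eight-digit string when it reaches the index. So it may be worth checking what the query looks like after parsing, for instance whether the digits survive query-time analysis.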