t this in the query
> construction.
I think requiring n terms at the same position would need a slop of 1-n,
and I'd like to have some test cases added for that.
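To make the 1-n arithmetic concrete, here is a minimal sketch. It assumes that the slop of a candidate match is the width of the enclosing span minus the summed lengths of the sub-spans; that is my reading of how the near spans match, not something confirmed in this thread. The class and method names are made up for the example.

```java
class SamePositionSlopSketch {
    /**
     * Slop of a candidate match: width of the enclosing span minus the
     * total length of the sub-spans. starts are inclusive, ends exclusive.
     */
    static int matchSlop(int[] starts, int[] ends) {
        int minStart = Integer.MAX_VALUE;
        int maxEnd = Integer.MIN_VALUE;
        int totalLength = 0;
        for (int i = 0; i < starts.length; i++) {
            minStart = Math.min(minStart, starts[i]);
            maxEnd = Math.max(maxEnd, ends[i]);
            totalLength += ends[i] - starts[i];
        }
        return (maxEnd - minStart) - totalLength;
    }

    public static void main(String[] args) {
        // Three single-token terms, all at position 7: width 1, total length 3.
        System.out.println(matchSlop(new int[] {7, 7, 7}, new int[] {8, 8, 8}));
        // Two adjacent single-token terms at positions 4 and 5.
        System.out.println(matchSlop(new int[] {4, 5}, new int[] {5, 6}));
    }
}
```

Under that assumption n terms at one position give 1 - n, and two adjacent terms give 0, so a slop of 1-n would only admit the same-position case.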
Now if I only had some time...
Regards,
Paul Elschot
>
> thanks,
>
> C>T>
> On Tue, Nov 24, 2009 at 9:17 AM, Chr
when spans at the same positions are considered
ordered.
Did I understand correctly that the unordered case with a slop of -1
and without the edit works to match terms at the same position?
In that case it may be worthwhile to add that to the javadocs,
and also add a few testcases.
Regards,
Paul El
y like to be able to do arbitrary span
> searches where tokens may be at the same position and also in other
> positions where the ordering of subsequent terms may be restricted as per
> the normal span API.
My pleasure,
Paul Elschot
>
> thanks,
>
> C>T>
>
> On Sun
w can I join several such filters together?
There are various ways. OpenBitSet and OpenBitSetDISI can do this,
and there's also BooleanFilter and ChainedFilter in contrib.
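As a rough illustration of joining filters by bit set operations, here is a minimal sketch that uses java.util.BitSet as a stand-in for OpenBitSet. The names FilterCombineSketch, andAll, orAll and bits are made up for this example and are not Lucene API.

```java
import java.util.BitSet;

class FilterCombineSketch {
    /** Docs accepted by every filter (AND). The inputs are not modified. */
    static BitSet andAll(BitSet... filters) {
        if (filters.length == 0) {
            return new BitSet();
        }
        BitSet result = (BitSet) filters[0].clone();
        for (int i = 1; i < filters.length; i++) {
            result.and(filters[i]);
        }
        return result;
    }

    /** Docs accepted by at least one filter (OR). The inputs are not modified. */
    static BitSet orAll(BitSet... filters) {
        BitSet result = new BitSet();
        for (BitSet filter : filters) {
            result.or(filter);
        }
        return result;
    }

    /** Helper: a bit set with the given doc ids set. */
    static BitSet bits(int... docIds) {
        BitSet result = new BitSet();
        for (int docId : docIds) {
            result.set(docId);
        }
        return result;
    }
}
```

The clone in andAll matters when the per-filter bit sets are cached and reused: combining in place would corrupt the cached copy.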
> Using FieldCacheTermsFilter sounds promising. Fortunately it is a single
> value field (our unique doc id).
Regards,
P
Try a MultiTermQueryWrapperFilter instead of the QueryFilter.
I'd expect a modest gain in performance.
In case it is possible to form a few groups of terms that are reused,
it could even be more efficient to also use a CachingWrapperFilter
for each of these groups.
Regards,
Paul Elscho
e too
much to only match at the same position.
SpanNearQuery may or may not work for a slop of -1, but one could try
that for both the ordered and unordered cases.
One way to do that is to start from the existing test cases.
Regards,
Paul Elschot
>
> Regards,
> Adriano Crestani
compatibility for minor version numbers
> (e.g. v3.5 will be compatible with v3.2)
> B) best effort drop-in back compatibility for the next minor version
> number only, and deprecations may be removed after one minor release
> (e.g. v3.3 will be compat with v3.2, but not v3.4)
I'd prefer B), with a minimum period of about two months to the
next release in case it removes deprecations.
Regards,
Paul Elschot
ed, for example by using the ones with the best
query score.
Limiting the number of terms would also be good, but that is less easy.
Regards,
Paul Elschot
>
> Chris
>
> 2009/10/12 Paul Elschot
>
> > Chris,
> >
> > You could also store term vectors for all docs a
Chris,
You could also store term vectors for all docs at indexing
time, and add the termvectors for the matching docs into a
(large) map of terms in RAM.
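The accumulation step could look like the sketch below. Each term vector is modelled as a plain Map from term text to frequency, which is an assumption made to keep the example self-contained; reading the actual term vectors from the index is left out, and the class name is hypothetical. It needs Java 8 or later for Map.merge.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TermCountSketch {
    /** Sums the per-document term frequencies of all matching docs into one map. */
    static Map<String, Integer> sumTermVectors(List<Map<String, Integer>> termVectorsOfMatchingDocs) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map<String, Integer> termVector : termVectorsOfMatchingDocs) {
            for (Map.Entry<String, Integer> entry : termVector.entrySet()) {
                // Add this doc's frequency to the running total for the term.
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }
}
```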
Regards,
Paul Elschot
On Monday 12 October 2009 21:30:48 Christoph Boosz wrote:
> Hi Jake,
>
> Thanks for your helpful explanat
the matching documents by using a counting
HitCollector on the IndexSearcher.
Regards,
Paul Elschot
As long as next(), skipTo(), doc() and score() on a Scorer work,
the search will be done. I hope the results are correct in this
case, but I'm not sure.
Regards,
Paul Elschot
On Wednesday 15 July 2009 19:08:00 Michael McCandless wrote:
> I don't think a toplevel BS2 is able to
happening.
Eks, could you try a toString() on the top level scorer for one of the affected
queries to see whether it shows BS2 on top level and BS for the inner scorers?
Regards,
Paul Elschot
>
> BooleanQuery only uses BooleanScorer when there are no required terms,
> and allowDocs
It is also possible to use the HitCollector api and simply ignore
the score values.
Regards,
Paul Elschot
On Saturday 04 July 2009 21:14:41 Mark Harwood wrote:
>
> Check out booleanfilter in contrib/queries. It can be wrapped in a
> constantScoreQuery
>
>
>
> On 4 Jul
there might be of help during interactive retrieval. Your application
is not really a web shop, but there are (at least) some overlaps.
Regards,
Paul Elschot
On Monday 04 May 2009 19:16:10 Christian Bongiorno wrote:
> I am trying to build a search (have been experimenting with using Lucene)
>
test and test.
> As a side note, will the Shingle Filter help me get all possible
> combinations of the input tokens?
I don't know.
Regards,
Paul Elschot
different weights in SpanTermQuery.
Regards,
Paul Elschot
On Friday 17 April 2009 12:18:46 Radhalakshmi Sreedharan wrote:
> To make the question simple,
>
> What I need is the following :
> If my document field is ( ab,bc,cd,ef) and Search tokens are (ab,bc,cd).
>
> Given the
On Thursday 09 April 2009 21:56:44 Andy wrote:
> Is there a way to have Lucene write the index to a txt file?
No. You could try a hexdump of the index file(s), but that isn't
really human readable. Instead of that you may want to try Luke:
http://www.getopt.org/luke/
Regards,
Paul Elschot
t,
and by a heap.
For the time being, Lucene does not have a low level facility for key values
that occur at most once per document field, so for these it normally helps
to use a Filter.
Regards,
Paul Elschot
I'm using
> ParallelMultiSearcher so I'm not even 100% sure that I know what index
> each Hit is located in.
It's the other way around: for span queries a search result is created
(internally, by SpanScorer) from the spans resulting from the getSpans()
method above.
Does tha
.
Regards,
Paul Elschot
On Tuesday 17 March 2009 12:35:19 Adrian Dimulescu wrote:
> Ian Lea wrote:
> > Adrian - have you looked any further into why your original two term
> > query was too slow? My experience is that simple queries are usually
> > extremely fast.
> Let
the new TrieRangeQuery:
http://wiki.apache.org/lucene-java/SearchNumericalFields
Regards,
Paul Elschot
when using the same criterion as in the removed methods
there, your original problem might not have occurred at all.
In the CachingWrapperFilter in trunk the choice is left to an overridable
method.
Regards,
Paul Elschot
>
> Regards,
> Raf
>
> On Sun, Feb 15, 2009 at 2:39 PM, P
when it is smaller than
OpenBitSet), please comment at LUCENE-1296.
Regards,
Paul Elschot
On Sunday 08 February 2009 09:47:24 Raffaella Ventaglio wrote:
> Hi Paul,
>
> One way to implement that would be to use one of the boolean combination
> > filters in contrib, BooleanFilter o
On Sunday 08 February 2009 09:53:00 Uwe Schindler wrote:
> I would do so, it's really simple, you can even do it in an anonymous inner
> class.
It is indeed simple, but it might also help to take a look at the source code
of the Lucene classes involved.
Regards,
Paul Elschot
>
ounting.
Could you describe how this compact forwarded index works?
> Similar to FieldCache idea but more compact.
Does this also use FieldCacheRangeFilter and/or FieldCacheTermsFilter?
Regards,
Paul Elschot
ng counts it uses
> even 2GB of memory (and this is very bad).
50.000 facets? Well, in case the performance of the last suggestion is
not good enough, one could try and implement a better data structure
than OpenBitSet and SortedVIntList to provide a DocIdSetIterator,
preferably with a fast skipTo() and possibly with a fast intersection count.
In that case, you may want to ask further on the java-dev list.
Regards,
Paul Elschot
sue and mention the performance
improvements?
Regards,
Paul Elschot
>
> -John
>
> On Thu, Jan 8, 2009 at 1:27 AM, Paul Elschot wrote:
>
> > John,
> >
> > Continuing, see below.
> >
> > On Wednesday 07 January 2009 14:24:15 Paul Elschot wrote:
> >
John,
Continuing, see below.
On Wednesday 07 January 2009 14:24:15 Paul Elschot wrote:
> On Wednesday 07 January 2009 07:25:17 John Wang wrote:
> > Hi:
> >
> >The default buffer size (for docid,score etc) is 32 in TermScorer.
> >
> > We have a large i
help, but not for AND queries.
See also LUCENE-430 on reducing buffer sizes for the underlying
TermDocs for very sparse doc sets.
Regards,
Paul Elschot
tribute to the score.
One might consider the scoring of the optional clauses to be an
implementation of the extended Boolean model.
Fuzzy searching is implemented by constructing a Boolean query
with optional (and actually present) terms that are similar enough to
the fuzzy query term.
Regards,
P
hat further caching
> could be done apart from the default caching which lucene does.
More caching is probably not going to help.
Regards,
Paul Elschot
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
lthough I understand the idea behind
> the setting, I am not sure why it made a difference in my case.
That option chooses another algorithm to search these queries; it
only affects queries without required terms.
(The change in search algorithm is from BooleanScore
y
lucene's OpenBitSet. Also have a look at earlier discussions
on the subject: you might find a good use for OpenBitSetDISI and
contrib/**/{BooleanFilter,ChainedFilter}.
Regards,
Paul Elschot
Op Tuesday 09 December 2008 07:44:20 schreef Michael Stoppelman:
> Hi all,
>
> I'm w
to show an unexpected tradeoff possibility
opened by the new Filter api.
I don't know whether you followed LUCENE-584 (Decouple Filter
from BitSet), but a contribution like this multi range filter makes
it all worthwhile.
Regards,
Paul Elschot
need to
> use it directly.
Is this part of the problem
https://issues.apache.org/jira/browse/LUCENE-1296
?
Also consider o.a.l.util.OpenBitSetDISI, and how that is used in
contrib/queries/**/BooleanFilter
Regards,
Paul Elschot
range boolean query.
>
> Mike, Paul, I'm happy to contribute this (ugly but working) code if
> there is interest. Let me know and I'll open a JIRA issue for it.
In case you think more performance improvements based on this
are possible...
Regards,
Paul Elschot.
sk,
> we could do this packing during index such that loading at search
> time is very fast.
Perhaps we'd better continue this at LUCENE-1231 or LUCENE-1410.
I think what you're referring to is PDICT, which has frame exceptions
for values that occur infrequently.
Regards,
Paul Elsc
>
> However that'd be quite a bit deeper change to Lucene.
The cheap version is hierarchical prefixing here:
http://wiki.apache.org/jakarta-lucene/DateRangeQueries
Regards,
Paul Elschot
ructure in the cache. (Sparse enough means
less than 1 in 8 of all docs available in the index reader.)
See also LUCENE-1296 for caching another data structure than the
one used to collect the filtered docs.
Regards,
Paul Elschot
Tim,
I didn't follow all the details, so this may be somewhat off,
but did you consider using TermVectors?
Regards,
Paul Elschot
Op Monday 10 November 2008 19:18:38 schreef Tim Sturge:
> Yes, that is a significant issue. What I'm coming to realize is that
> either I will end u
PlaceAnd() is not optimal, although it
should work just fine. A patch for a performance
improvement will follow.
Regards,
Paul Elschot
>
> Cheers
> Mark
>
>
/queries/**/BooleanFilter
Regards,
Paul Elschot
Op Saturday 08 November 2008 19:06:15 schreef Timo Nentwig:
> Hi!
>
> Since Filter.bits() is deprecated and replaced by getDocIdSet() now I
> wonder how I am supposed to combine (AND) filters (for facets).
>
> I worked around this
ator superclass):
public abstract int estimatedDocFreq();
and implement this for all existing instances. TermScorer could
implement it without estimating.
For AND/OR/NOT such an estimation is straightforward but for
proximity queries it would be more of a guess.
Regards,
Paul Elschot
Op Friday 05 September 2008 16:57:34 schreef Mark Miller:
> Paul Elschot wrote:
> > Op Thursday 04 September 2008 20:39:13 schreef Mark Miller:
> >> Sounds like it's more in line with what you are looking for. If I
> >> remember correctly, the phrase query factors i
and idf is not used for scoring Spans.
The reason why idf is not used could be that there is no basic
score value associated with inner spans; only top level spans
are scored by SpanScorer.
For more details, please consult the SpanScorer code.
Regards,
Paul Elschot
>
> - Mark
>
>
Op Saturday 30 August 2008 18:22:50 schreef Matt Ronge:
> On Aug 30, 2008, at 6:13 AM, Paul Elschot wrote:
> > Op Saturday 30 August 2008 03:34:01 schreef Matt Ronge:
> >> Hi all,
> >>
> >> I am working on implementing a new Query, Weight and Scorer that
> >
Op Wednesday 03 September 2008 18:06:57 schreef Matt Ronge:
> On Aug 30, 2008, at 3:01 PM, Paul Elschot wrote:
> > Op Saturday 30 August 2008 18:19:09 schreef Matt Ronge:
> >> On Aug 30, 2008, at 4:43 AM, Karl Wettin wrote:
> >>> Can you tell us a bit more ab
filtering at all, because it already uses skipTo() where possible.
In case you are looking for documents that contain partial phrases
from an input query that has more than 2 words, have a look at Nutch.
Regards,
Paul Elschot
>
>
> --
> Matt
>
> >> Hi all,
> >>
>
erates on?
Yes, Filters.
> Or should I just implement something myself in a custom scorer?
In case you have a better way than skipTo(), or something
to improve on this issue to allow a Filter as clause to BooleanQuery:
https://issues.apache.org/j
Op Thursday 24 July 2008 23:00:33 schreef Robert Stewart:
> Queries are very complex in our case, some have up to 100 or more
> clauses (over several fields), including disjunctions and prohibited
> clauses.
Other than the earlier advice, did you try setAllowDocsOutOfOrder() ?
Rega
r docs and these score values. Then use this as the
scorer for a new Query, via a Weight.
Once this new Query is available, just add it as required to a
BooleanQuery.
Regards,
Paul Elschot
positions
SpanScorer will also need to be extended or even replaced.
In case you want to continue this discussion, please do so
on java-dev.
Regards,
Paul Elschot.
't know. The Spans interface does not contain a weight() or
score() method, so there is no way to pass such information
to SpanScorer.
Regards,
Paul Elschot
>> (often 1000) of constituent TermQueries. I'm wondering if there is
> >> a better way to do this?
> >> I'm open to implementing my own Query subclass if I can expect
> >> significant performance improvements from doing this.
Does BooleanQuery.setAllowDocsOut
Op Sunday 18 May 2008 16:30:26 schreef Karl Wettin:
> 18 maj 2008 kl. 00.01 skrev Paul Elschot:
> > Op Saturday 17 May 2008 20:28:40 schreef Karl Wettin:
> >> As far as I know Lucene only handles single word synonyms at index
> >> time. My life would be much simple
or the synonym. Was this one of the workarounds?
The advantage of the zero position increment is that the original
token positions are not affected, so at least there is no influence
on scoring because of changes in the original token positions.
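A small sketch of why a zero position increment leaves the original positions alone. It assumes the usual convention that a token's position is the running sum of the position increments, starting from -1, so that a first token with increment 1 lands on position 0. The class name is made up for the example.

```java
class PositionIncrementSketch {
    /** Absolute token positions from position increments (running sum, starting at -1). */
    static int[] positions(int[] increments) {
        int[] result = new int[increments.length];
        int current = -1;
        for (int i = 0; i < increments.length; i++) {
            current += increments[i];
            result[i] = current;
        }
        return result;
    }

    public static void main(String[] args) {
        // Tokens: "fast", synonym "quick" (increment 0), "car".
        int[] withSynonym = positions(new int[] {1, 0, 1});
        // Tokens: "fast", "car" (no synonym injected).
        int[] withoutSynonym = positions(new int[] {1, 1});
        // "car" is at position 1 in both cases.
        System.out.println(withSynonym[2] + " " + withoutSynonym[1]);
    }
}
```

With an increment of 1 for the synonym, "car" would move to position 2, which is the kind of shift in the original token positions that the zero increment avoids.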
Regards,
Paul Elschot
ed)
and/or/phrase/span) make sure that the subscore values are
combined into another value that has the same theoretical
maximum.
Have a look here to start:
https://issues.apache.org/jira/browse/LUCENE-293
Regards,
Paul Elschot
iltered case.
> I guess your suggested solution is my best option without changing
> the way getSpans works (which I'm not going to change any time soon )
Before doing that, have a look at the code of SpanWeight/SpanScorer,
ConjunctionScorer, and the filtering code in IndexSearcher.
Regards,
P
internally but I guess that if the
> filter is known beforehand,
A Filter needs to make a BitSet available before the query search.
> it could speed things up quite a bit.
I would expect a substantial speedup from using skipTo() on the
Spans when only 0.1% of the results passes the fi
Op Tuesday 06 May 2008 17:39:38 schreef Paul Elschot:
> Eran,
>
> Op Tuesday 06 May 2008 10:15:10 schreef Eran Sevi:
> > Hi,
> >
> > I am looking for a way to filter a SpanQuery according to some
> > other query (on another field from the one used for the SpanQu
      // use spans.start() and spans.end() here
      // ...
      more = spans.next();
    }
    if (! more) {
      break;
    }
    filterDoc = bits.nextSetBit(spans.doc());
  }
Please check the javadocs of java.util.BitSet, there may
be a 1 off error in the arguments to nextSetBit().
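On the possible off-by-one: java.util.BitSet.nextSetBit(fromIndex) returns the first set bit at or after fromIndex, or -1 when there is none. The sketch below shows the same skipping idea in a self-contained form. The Spans are replaced by a sorted array of doc ids with one entry per doc, which is an assumption made only so that the example runs without Lucene; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class FilterSkipSketch {
    /** Returns the docs from sortedSpanDocs (ascending, distinct) whose bit is set in bits. */
    static List<Integer> filteredDocs(int[] sortedSpanDocs, BitSet bits) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        while (i < sortedSpanDocs.length) {
            int doc = sortedSpanDocs[i];
            // First doc passing the filter at or after the current doc, -1 if none.
            int filterDoc = bits.nextSetBit(doc);
            if (filterDoc < 0) {
                break; // no more docs pass the filter
            }
            if (filterDoc == doc) {
                hits.add(doc); // the real code would use the span positions here
                i++;
            } else {
                // Stand-in for spans.skipTo(filterDoc): advance to the first doc >= filterDoc.
                while (i < sortedSpanDocs.length && sortedSpanDocs[i] < filterDoc) {
                    i++;
                }
            }
        }
        return hits;
    }
}
```

For example, span docs {2, 5, 9, 12} with filter bits {5, 10, 12} yield the docs 5 and 12.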
Regards,
Paul Elschot
>
> I tried looking
arSpansOrdered class in
the org.apache.lucene.search.spans package to allow
a match for less than all subqueries. This is not going to
be straightforward, but it is possible. In case you choose
this last option, please continue on the java-dev list.
Regards,
Paul Elschot
>
> On Fri, Apr 4, 2008
mer. I had really convinced myself till the
> thought came to me at lunch :).
For a single query, adding a filter of course has a cost.
But when the location part can be reused in later queries,
give CachingWrapperFilter a try.
Regards,
Paul Elschot
>
> -M
>
> On Wed, Apr 16, 2008
Op Saturday 12 April 2008 00:03:13 schreef Antony Bowesman:
> Paul Elschot wrote:
> > Op Friday 11 April 2008 13:49:59 schreef Mathieu Lecarme:
> >> Use Filter and BitSet.
> >> From the personal data, you build a Filter
> >> (http://lucene.apache.org/jav
ne.apache.org/java/2_3_1/api/org/apache/lucene/search/Filter.html) which is used in the main index.
With 1 billion mails, and possibly a Filter per user, you may want to
use more compact filters than BitSets, which is currently possible
in the development trunk of lucene.
Regards,
Paul Elschot
is no specific reason why it cannot be done, one only needs
to provide the corresponding tokenizer to be used at indexing time.
Kind regards,
Paul Elschot
>
> Itamar.
>
> -Original Message-
> From: Paul Elschot [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 08, 2008 1:5
Itamar,
Have a look here:
http://lucene.apache.org/java/2_3_1/scoring.html
Regards,
Paul Elschot
Op Tuesday 08 April 2008 00:34:48 schreef Itamar Syn-Hershko:
> Paul and John,
>
> Thanks for your quick reply.
>
> The problem with query rewriting is the aforementioned
>
match
a document, as long as at least one matches.
For the required query parts (AND like), Scorer.skipTo()
is used, and that could well be the filter mechanism you
are referring to; have a look at the javadocs of Scorer,
and, if necessary, at the actual code of ConjunctionScorer.
Regards,
Paul
more data to the lucene index that can be used to reduce
the number of results to be fetched.
Regards,
Paul Elschot
Op Wednesday 26 March 2008 13:51:24 schreef Shailendra Mudgal:
> > The bottom line is that reading fields from docs is expensive.
> > FieldCache will, I believe, lo
reason, retrieving docs is best done in doc id
order, but that is unlikely to go wrong as doc ids are normally
collected in increasing order.
Regards,
Paul Elschot
Op Tuesday 25 March 2008 13:43:18 schreef Shailendra Mudgal:
> Hi Everyone,
>
> We are using Lucene to search on a index
Op Saturday 22 March 2008 00:32:32 schreef Paul Elschot:
> Milu,
>
> This is a PHP problem, not a Lucene one, so you might get better
> response at a PHP mailing list.
>
> The easy way around your problem is probably by invoking a shell
> script from php that exports
, you'll probably want to use the PHP/Java extension
to avoid initializing a JVM for each call to lucene. Try this:
http://www.google.nl/search?q=php+java+org+apache+lucene&ie=UTF-8&oe=UTF-8
This was one of the results:
http://www.idimmu.net/index.php?blog%5Bpagenum%5D=3
Regards,
Paul
approach?
Have a look at Searcher.explain()
Regards,
Paul Elschot
irstQuery, BooleanClause.Occur.MUST); //must is
> like an AND
> overallquery.add(secondQuery, BooleanClause.Occur.MUST):
There is no need for a QueryParser in this case when using a
TermQuery instead of a Query for q1, q2, q3 and q4:
TermQuery q1 = new TermQuery(new Term("title", "ter
already have a firm requirement for that case?
SpanNotQuery can be used to prevent matches over paragraph
borders when these are indexed as such, but I would not expect
that you would need those, given the fuzziness of the [10/5/2].
Regards,
Paul Elschot
Op Friday 15 February 2008 09:45:58 schreef
avoid disjunctions. For example for verbs, one could
index only the stem and use a payload for the actual inflected
form (singular/plural, past/present, first/second/third person, etc).
Regards,
Paul Elschot
>
> Cedric
>
>
> On Fri, Feb 15, 2008 at 7:15 AM, Paul Elschot <[EM
revert to using another field for different position info.
Regards,
Paul Elschot
Op Thursday 14 February 2008 09:44:40 schreef Cedric Ho:
> Hi Paul,
>
> Sorry I am not sure I understand your solution.
>
> Because I would need to apply this scoring logic to all the different
> types
y on this extra field would almost do, and you will
probably need https://issues.apache.org/jira/browse/LUCENE-1093 .
This will be somewhat slower than using a payload, because the search
will be done in two separate fields, but it will work.
Regards,
Paul Elschot
return TopDocs. From this one
can make a precision/recall graph for the query by considering
the total results higher than a given score.
When a lot of such computations are needed, you may also want
to cache the values of a unique identifier field for all indexed docs,
have a look at Field
ing the results.
Regards,
Paul Elschot
Op Friday 08 February 2008 05:48:08 schreef Nilesh Bansal:
> Hi,
>
> I want to create a function, which takes in a query string (in lucene
> syntax), and a string as content and returns back if the query matches
> the content or not. This wou
Op Tuesday 29 January 2008 03:32:08 schreef Daniel Noll:
> On Friday 25 January 2008 19:26:44 Paul Elschot wrote:
> > There is no way to do exact phrase matching on OCR data, because no
> > correction of OCR data will be perfect. Otherwise the OCR would have made
>
he contrib area. It has truncation and proximity based on span
queries,
but no fuzzy term matching, so it could also be a start for investigating.
It all depends on how good the OCR was, but in some cases (think old paper)
it's just not possible to do good OCR.
Regards,
Paul Elschot
for all terms in the query, a separate scorer will be used during query search.
The query rewrite could in principle do this, but it might affect the score
values.
Regards,
Paul Elschot
e indexed to allow filtering, and stored to
allow retrieval for filtering in another index. Retrieving stored fields
is normally a performance bottleneck, so a FieldCache might be handy.
Regards,
Paul Elschot
On Thursday 10 January 2008 12:58:44 sachin wrote:
> Here are more details about my i
rers and by Span Scorer.
That is for the case that offsets were meant to be positions
within a document.
It is also possible that offsets were meant in the sense of using
skipTo(doc) instead of next() on a Scorer. This is done during
query search when at least one term is required.
Regards,
Paul Els
that on top of
TermEnum.
The TermEnum starts at a given field/term and iterates through all indexed
terms after that, including terms with field names ordered later than
the given field. That's why the field name must be checked in the Term.
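The field name check can be illustrated with a self-contained sketch. A TreeSet of (field, text) pairs, ordered by field and then by text, stands in for the term dictionary; the Term class here is a simplified stand-in and not the Lucene class, and termsFrom is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class TermEnumSketch {
    /** Simplified term: ordered by field name first, then by term text. */
    static final class Term implements Comparable<Term> {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public int compareTo(Term other) {
            int byField = field.compareTo(other.field);
            return byField != 0 ? byField : text.compareTo(other.text);
        }
    }

    /** Term texts of the given field, starting at startText (inclusive). */
    static List<String> termsFrom(TreeSet<Term> index, String field, String startText) {
        List<String> result = new ArrayList<>();
        for (Term term : index.tailSet(new Term(field, startText), true)) {
            if (!term.field.equals(field)) {
                break; // the iteration ran into the next field
            }
            result.add(term.text);
        }
        return result;
    }
}
```

Without the field check, an enumeration started at a term of one field would simply continue into the terms of every field that sorts after it.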
Perhaps that could be another bit functio
On Tuesday 18 December 2007 14:59:45 Peter Keegan wrote:
>
> Should I open a Jira issue?
>
What shall I say?
http://www.apache.org/foundation/how-it-works.html
Regards,
Paul Elschot
Karl,
This might work for you:
https://issues.apache.org/jira/browse/LUCENE-293
Regards,
Paul Elschot
On Friday 14 December 2007 18:06:01 Karl Wettin wrote:
> I have an index that contains three sorts of documents:
>
> Car brand
> Tire brand
> Tire pressure
>
> (Please b
"foo^0") => returns the same X results even if all scores are 0
In the patch, Matcher is a superclass of Scorer and it does not have the
score() method, so 'matching' is independent of any score value.
The matchi
Gentlefolk,
Well, the javadocs as patched at LUCENE-584 try to change all
the cases of zero scoring to 'non matching'.
I'm happily bracing for a minor conflict with that patch. In case
someone wants to take another look at the javadocs as
patched there, don't let me stop y
On Tuesday 06 November 2007 23:14:01 Mike Klaas wrote:
> On 29-Oct-07, at 9:43 AM, Paul Elschot wrote:
> > On Friday 26 October 2007 09:36:58 Ard Schrijvers wrote:
> >> +prop1:a +prop2:b +prop3:c +prop4:d +prop5:e
> >>
> >> is much faster than
> >>
>
t.
This Y% is not directly possible, but I would expect the default
document score to correlate reasonably well with coverage.
In case you want an exact Y% cutoff, you'll run into the fact
that the field norm (the inverse square root of the field length)
is encoded in only 8 bits, which is
nding your queries with these
special tokens, for example: "=begin= foo bar dot =end=" .
Regards,
Paul Elschot
ding
BooleanQuery.rewrite(). Take care about query weights, though.
Regards,
Paul Elschot
>
> thanks for any help,
>
> Regards Ard
and SortedVIntList.
Regards,
Paul Elschot
On Saturday 27 October 2007 02:15:48 Yonik Seeley wrote:
> On 10/26/07, John Patterson <[EMAIL PROTECTED]> wrote:
> > Thom Nelson wrote:
> > > Check out the HashDocSet from Solr, this is the best way to cache small
> > > se
does not work for this because it works on doc level
and not within the matching text of a field.
Regards,
Paul Elschot
On Wednesday 17 October 2007 17:57:21 Dave Golombek wrote:
> We've run into a situation where having "NOT NEAR" queries would really
> help. I hav
for. I was hoping for a cleaner
> approach.
You can try this:
Explanation e = indexSearcher.explain(query, documentId);
and get the score value from the explanation.
Have a look at the code of any Scorer.explain() method on
how to get the score value only. There really is no need to filter
. The reason for that is performance,
BooleanScorer uses a faster data structure than a priority queue,
but BooleanScorer does not implement skipTo().
Regards,
Paul Elschot
On Thursday 04 October 2007 09:12, Dan Rich wrote:
> Hi,
>
> I have a custom Query class that provides a long list
As for suggestions on how to do this, I have none other than
to make sure that you can create the queries necessary to obtain
the required output.
Regards,
Paul Elschot
On Sunday 30 September 2007 09:20, Mohammad Norouzi wrote:
> Hi Paul,
> thanks, I got your idea, now I am planning to imp