Hi,
I have 3 records in a Lucene index.
Record 1 contains healthcare in the title field. Record 2 contains healthcare
and insurance in the description field, but not together. Record 3 contains
healthcare insurance in the company name field.
When a user searches for healthcare insurance, I want to show records i
Excuse the cross-posting and gratuitous marketing :)
Erik
My company, Lucid Imagination, is sponsoring a free and in-depth
technical webinar with Erik Hatcher, one of our co-founders at Lucid
Imagination, as well as co-author of Lucene in Action, and Lucene/Solr
PMC member and com
Scott Smith wrote:
> I've been looking at the changes I have to make in my code to go from
> 2.4.1 to 2.9. One of the features I have is to highlight query hits in
> documents which meet the search criteria. If the query has a phrase,
> then I need to highlight the phrase, but not isolated words
Felipe Lobo wrote:
> Hi, I updated my Lucene lib to 2.9.0 and I'm trying to instantiate the
> SpanScorer, but the constructor is protected.
> I looked in the javadoc of lucene and saw 2 subclasses of it
> (PayloadNearQuery.PayloadNearSpanScorer,
> PayloadTermQuery.PayloadTermWeight.PayloadTermSpanSc
Hi, I updated my Lucene lib to 2.9.0 and I'm trying to instantiate the
SpanScorer, but the constructor is protected.
I looked in the javadoc of lucene and saw 2 subclasses of it
(PayloadNearQuery.PayloadNearSpanScorer,
PayloadTermQuery.PayloadTermWeight.PayloadTermSpanScorer).
Using these classes is
I've been looking at the changes I have to make in my code to go from
2.4.1 to 2.9. One of the features I have is to highlight query hits in
documents which meet the search criteria. If the query has a phrase,
then I need to highlight the phrase, but not isolated words from the
phrase which also
About clear(Object sentinel) - is it still a question
no, it is not. Makes no sense with mutable elements :)
- Original Message
> From: Shai Erera
> To: java-user@lucene.apache.org
> Sent: Wednesday, 30 September, 2009 21:02:19
> Subject: Re: TSDC, TopFieldCollector & co
>
> I was h
I was half way through answering the second part when I noticed your second
update :).
I don't know about adding reset() to Collector. It makes sense "for
completeness" in case other Collectors can be reset() as well. But reset()
is a delicate method. It needs to be used cautiously. E.g., if you a
forget the question about initialize(), reading javadoc before asking already
answered questions helps a lot, sorry for the noise. ...NOTE in
getSentinelObject() javadoc...
- Original Message
> From: eks dev
> To: java-user@lucene.apache.org
> Sent: Wednesday, 30 September, 2009 20:
> BTW eks, you asked about reusing TSDC.
Yeah, it is normally not a big deal to allocate everything again, but these
arrays are not necessarily small, so I guess it would make sense to open up
this possibility.
Where do you think it would be better to add reset(): to TSDC or to Collector?
I would even s
Way the heck better - Hits is horrible for that. It caches like 100 hits
and then keeps searching when you exhaust the cache (been a while since
I've looked at the exact numbers). It's horribly inefficient for checking
every hit.
Hits will end up using a Collector anyway - and then throw a speed tr
Thanks Mark that's exactly what I need. How does the performance of
processing each document in the collect method of HitCollector compare to
looping through the Hits in the deprecated Hits class?
On Tue, Sep 29, 2009 at 7:40 PM, Mark Miller wrote:
> Max Lynch wrote:
> > Hi,
> > I am developing
BTW eks, you asked about reusing TSDC. PQ has a clear() method, so it can be
reused. Only currently it's final and nullifies the array. We'll need to
un-final it, and then override in HitQueue to just reset the ScoreDoc
instances to be sentinels again. And of course add a reset() method to TSDC.
O
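To make the idea above concrete, here is a minimal, self-contained sketch of a fixed-size queue that is pre-filled with sentinel entries and can be reset in place instead of reallocated. It is not Lucene's PriorityQueue or HitQueue: the names SentinelQueueSketch, Hit, SentinelQueue, collect(), reset() and bestScores() are all hypothetical, chosen only for illustration.

```java
import java.util.Arrays;

// Illustrative stand-in only; not Lucene's PriorityQueue/HitQueue API.
final class SentinelQueueSketch {

    /** Mutable hit, similar in spirit to a ScoreDoc (hypothetical type). */
    static final class Hit {
        int doc;
        float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Fixed-size min-heap on score; heap[0] is always the weakest entry. */
    static final class SentinelQueue {
        private final Hit[] heap;

        SentinelQueue(int numHits) {
            heap = new Hit[numHits];
            // Pre-allocate the full array and fill it with sentinels once.
            for (int i = 0; i < numHits; i++) {
                heap[i] = new Hit(Integer.MAX_VALUE, Float.NEGATIVE_INFINITY);
            }
        }

        /** Turn every entry back into a sentinel without reallocating. */
        void reset() {
            for (Hit h : heap) {
                h.doc = Integer.MAX_VALUE;
                h.score = Float.NEGATIVE_INFINITY;
            }
        }

        /** Overwrite the weakest entry if the new hit beats it. */
        void collect(int doc, float score) {
            Hit top = heap[0];
            if (score <= top.score) return;
            top.doc = doc;
            top.score = score;
            siftDown(0);
        }

        private void siftDown(int i) {
            int n = heap.length;
            while (true) {
                int left = 2 * i + 1, right = left + 1, smallest = i;
                if (left < n && heap[left].score < heap[smallest].score) smallest = left;
                if (right < n && heap[right].score < heap[smallest].score) smallest = right;
                if (smallest == i) return;
                Hit tmp = heap[i];
                heap[i] = heap[smallest];
                heap[smallest] = tmp;
                i = smallest;
            }
        }

        /** Scores of the real (non-sentinel) hits, best first. */
        float[] bestScores() {
            int count = 0;
            for (Hit h : heap) if (h.score != Float.NEGATIVE_INFINITY) count++;
            float[] out = new float[count];
            int k = 0;
            for (Hit h : heap) if (h.score != Float.NEGATIVE_INFINITY) out[k++] = h.score;
            Arrays.sort(out); // ascending, then reverse in place
            for (int a = 0, b = out.length - 1; a < b; a++, b--) {
                float t = out[a]; out[a] = out[b]; out[b] = t;
            }
            return out;
        }
    }
}
```

Because every sentinel has the same lowest possible score, the array is a valid heap right after construction and again right after reset(), so no re-heapify step is needed in this sketch.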
Thanks Mark, Shai,
I was getting confused by so many possibilities to do the "almost the same
thing" ;)
But I have figured it out by peeking into the BooleanQuery code that decides
whether "out of order" should be used: BQ will pick the right TSDC. I like it;
option 1 it is, minimum user code.
Cheers
Hello,
I am in the process of trying out the lucene patch LUCENE-1634,
however I'm not getting the expected behavior.
I see that the segments are not getting merged even after all the
documents are deleted from them.
Because of this, the index size grows very large. The
expec
I agree. If you need sort-by-score, it's better to use the "fast" search
methods. IndexSearcher will create the appropriate TSDC instance for you,
based on the Query that was passed.
If you need to create multiple Collectors and pass a kind of Multi-Collector
to IndexSearcher, then you should crea
If you want relevance sorting (Sort.Score not Sort.Relevance right?),
I'd think you want to use TopScoreDocCollector, not TopFieldCollector.
The only reason to use relevance with TopFieldCollector is if you
are doing an nth sort with a field sort as well.
You don't really need to worry about th
And another question: is it somehow possible to reuse a TopScoreDocCollector
instance?
The Javadoc in create(...) warns about allocating a full array.
NOTE: The instances returned by this method pre-allocate a full array of
length numHits, and fill the array with sentinel objects.
I tried to traverse all the term text in one of the tis files, and it failed.
The code is below.
Am I misunderstanding something?
The source code (especially the index namespace) is very complicated for me.
Is there any more documentation about the design, or anything else that can
help me understand the source?
Thanks.
Hi All,
What is the best way to achieve the following and what are the differences, if
I say "I do not normalize scores, so I do not need max score tracking, I do not
care if hits are returned in doc id order, or any other order. I need only to
get maxDocs *best scoring* documents":
OPTION 1:
You could look into modifying the standard tokenizer lexer code to
handle punctuation (there is a patch in the issue tracker for the old
javacc grammar to handle punctuation) and there is also the Gate NLP
project which has a fairly nice sentence splitter you might find
useful. Add a whol
Robert Muir wrote:
try checking out PerFieldAnalyzerWrapper, so you can specify how each field
is handled, i.e. some fields with KeywordAnalyzer, other fields with
StandardAnalyzer, etc.
Thanks, yes, actually I realize these fields do need some analysis
because I want the search to be case ins
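As a rough illustration of the per-field idea in the quoted suggestion, here is a self-contained sketch that dispatches on field name to different analysis functions. It does not use Lucene: PerFieldAnalysisSketch, keyword() and lowercaseWords() are hypothetical stand-ins, and the lowercasing word splitter is only an assumed example of "some analysis", not what StandardAnalyzer actually does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Illustrative stand-in only; not Lucene's PerFieldAnalyzerWrapper.
final class PerFieldAnalysisSketch {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldAnalysisSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    /** Register a field-specific analysis function. */
    void addAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    /** Use the field's own function if one is registered, else the default. */
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    /** Keyword-style: the whole value is a single, unmodified token. */
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    /** Assumed simple analysis: lowercase, split on non-word characters. */
    static List<String> lowercaseWords(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }
}
```

With lowercaseWords as the default and keyword registered for a hypothetical "id" field, an "id" value stays a single exact token while every other field is split and lowercased.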
23 matches