Not quite sure what you're asking for, but I think you want to use a span
near query (to boost phrases) inside a disjunction max query
(to define weights for the different fields).
karl
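Karl's suggestion can be sketched roughly as follows. This is a minimal, self-contained example, not the original poster's code; the field names "title" and "body", the sample text, and the boost values are all assumptions for illustration (Lucene 2.9-era API):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.DisjunctionMaxQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class SpanDisMaxSketch {
  static int run() throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_29),
        true, IndexWriter.MaxFieldLength.UNLIMITED);
    Document d = new Document();
    d.add(new Field("title", "senior tax analyst", Field.Store.YES, Field.Index.ANALYZED));
    d.add(new Field("body", "a report on tax analysis", Field.Store.YES, Field.Index.ANALYZED));
    w.addDocument(d);
    w.close();

    // Phrase-like span query on the title: "tax" and "analyst" adjacent, in order.
    SpanNearQuery titlePhrase = new SpanNearQuery(new SpanQuery[] {
        new SpanTermQuery(new Term("title", "tax")),
        new SpanTermQuery(new Term("title", "analyst")) }, 0, true);
    titlePhrase.setBoost(2.0f); // phrase matches in the title count extra

    Query bodyTerm = new TermQuery(new Term("body", "tax"));

    // DisjunctionMaxQuery scores each doc by its best-matching clause,
    // plus a small tie-breaker contribution from the other clauses.
    DisjunctionMaxQuery dmq = new DisjunctionMaxQuery(0.1f);
    dmq.add(titlePhrase);
    dmq.add(bodyTerm);

    IndexSearcher s = new IndexSearcher(dir, true);
    TopDocs td = s.search(dmq, 10);
    s.close();
    return td.totalHits;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(run() + " hit(s)");
  }
}
```

The sample document matches both clauses, so a single hit is returned, scored by the boosted title phrase.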
On 1 Oct 2009, at 02:40, mitu2009 wrote:
Hi,
I've got 3 records in the Lucene index.
Record 1 contains healt
Hi,
I've got 5 records in the Lucene index:
a. Record 1 contains "tax analysis". Date field value is March 2009.
b. Record 2 contains "Senior tax analyst". Date field value is Aug 2009.
c. Record 3 contains "Senior tax analyst". Date field value is July 2009.
d. Record 4 contains "tax analyst". Date field value
I have a question about the reopen functionality in Lucene 2.9. As I
understand it, since FieldCaches are now per-segment, it can avoid reloading
everything when the index is reopened, and instead just load the new
segments.
For background, like many people we have a distributed architecture wher
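The reopen idiom being described can be sketched like this (a minimal example under the assumption of a RAMDirectory and a hypothetical "body" field; real code would reopen a disk-based index):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class ReopenSketch {
  static int run() throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_29),
        true, IndexWriter.MaxFieldLength.UNLIMITED);
    addDoc(w, "first");
    w.commit();

    IndexReader reader = IndexReader.open(dir, true);

    addDoc(w, "second");
    w.commit();

    // reopen() returns a new reader that shares the unchanged, already-loaded
    // segments with the old one; only the new segments are loaded from disk.
    IndexReader newReader = reader.reopen();
    if (newReader != reader) {
      reader.close(); // the old reader must still be closed explicitly
      reader = newReader;
    }
    int n = reader.numDocs();
    reader.close();
    w.close();
    return n;
  }

  static void addDoc(IndexWriter w, String text) throws Exception {
    Document d = new Document();
    d.add(new Field("body", text, Field.Store.NO, Field.Index.ANALYZED));
    w.addDocument(d);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(run());
  }
}
```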
Hello,
I have created a custom Tokenizer and am trying to set and extract my own
positions for each Token using:
reusableToken.reinit(word.getWord(),tokenStart,tokenEnd);
later when querying my index using a SpanTermQuery the start() and end()
tags don't correspond to these values but seem to co
Hi Mike,
The first thing that comes to mind is to run a query for each document
type (assuming that you have a field that stores the type) and qualify
the document type: for example type:pdf. Then you would have to write
something to combine the query results drawing an equal number of hits
Hi Mike,
I'd simply store a field "doctype" with values "pdf", "txt", "html"
and perform a separate search for each type. Though I'd be
interested if anyone has a cooler way of doing this.
Cheers,
Phil
On Thu, Oct 1, 2009 at 9:56 AM, Michael Masters wrote:
> I was wondering if there is any w
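The doctype approach suggested in the replies can be sketched as follows. The field names "doctype"/"contents" and the sample documents are assumptions, not anything from the thread:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class DocTypeSearch {
  static int[] run() throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_29),
        true, IndexWriter.MaxFieldLength.UNLIMITED);
    addDoc(w, "pdf", "lucene in action");
    addDoc(w, "pdf", "a lucene tutorial");
    addDoc(w, "html", "the lucene faq");
    w.close();

    IndexSearcher s = new IndexSearcher(dir, true);
    String[] types = { "pdf", "html", "txt" };
    int[] counts = new int[types.length];
    for (int i = 0; i < types.length; i++) {
      BooleanQuery q = new BooleanQuery();
      q.add(new TermQuery(new Term("contents", "lucene")), BooleanClause.Occur.MUST);
      // qualify by document type, e.g. doctype:pdf
      q.add(new TermQuery(new Term("doctype", types[i])), BooleanClause.Occur.MUST);
      counts[i] = s.search(q, 10).totalHits;
    }
    s.close();
    return counts;
  }

  static void addDoc(IndexWriter w, String type, String contents) throws Exception {
    Document d = new Document();
    // the type field is NOT_ANALYZED so a TermQuery matches it exactly
    d.add(new Field("doctype", type, Field.Store.YES, Field.Index.NOT_ANALYZED));
    d.add(new Field("contents", contents, Field.Store.NO, Field.Index.ANALYZED));
    w.addDocument(d);
  }

  public static void main(String[] args) throws Exception {
    int[] c = run();
    System.out.println(c[0] + " pdf, " + c[1] + " html, " + c[2] + " txt");
  }
}
```

The per-type hit lists could then be interleaved however the application wants its distribution.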
Thanks, I will try NumberRangeQuery
On Thu, Oct 1, 2009 at 4:01 PM, Grant Ingersoll wrote:
>
> On Sep 29, 2009, at 11:30 AM, Dragan Jotanovic wrote:
>
>> Hi, I was thinking a long time how to implement this kind of
>> functionality but couldn't figure out anything appropriate.
>> In my lucene doc
I was wondering if there is any way to control what kinds of documents
are returned from a search. For example, let's say we have an index
built from different types of documents (pdf, txt, html, etc.). Is
there a way to have the first x results follow a specified distribution
of document types? It wou
Andrzej Bialecki wrote:
Hi all,
I'm happy to announce the new release of Luke - the Lucene Index Toolbox.
There's a bug in this version in that it doesn't show TermVectors for a
field. I'll fix it in a few days - I'm waiting for other potential bugs
to show up. So if you find something that
On Sep 29, 2009, at 11:30 AM, Dragan Jotanovic wrote:
Hi, I've been thinking for a long time about how to implement this kind of
functionality but couldn't figure out anything appropriate.
In my lucene document, I have two date fields: start and end date.
As a search input I have current date (NOW).
I need t
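The question is cut off, but if the goal is finding documents whose [start, end] window contains the current date, one sketch (my assumption about the intent, using sortable yyyyMMdd strings and assumed field names) is two range constraints combined in a BooleanQuery:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class DateWindowSearch {
  static int run() throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_29),
        true, IndexWriter.MaxFieldLength.UNLIMITED);
    addDoc(w, "20090901", "20091231"); // window contains 2009-10-01
    addDoc(w, "20080101", "20090101"); // window already closed
    w.close();

    String now = "20091001"; // the "current date" as a sortable yyyyMMdd string

    BooleanQuery q = new BooleanQuery();
    // start <= now: start lies in (unbounded .. now]
    q.add(new TermRangeQuery("start", null, now, true, true), BooleanClause.Occur.MUST);
    // end >= now: end lies in [now .. unbounded)
    q.add(new TermRangeQuery("end", now, null, true, true), BooleanClause.Occur.MUST);

    IndexSearcher s = new IndexSearcher(dir, true);
    int hits = s.search(q, 10).totalHits;
    s.close();
    return hits;
  }

  static void addDoc(IndexWriter w, String start, String end) throws Exception {
    Document d = new Document();
    d.add(new Field("start", start, Field.Store.YES, Field.Index.NOT_ANALYZED));
    d.add(new Field("end", end, Field.Store.YES, Field.Index.NOT_ANALYZED));
    w.addDocument(d);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(run());
  }
}
```

NumericRangeQuery (new in 2.9, mentioned later in the thread) would do the same job more efficiently on numeric fields.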
Felipe Lobo wrote:
> Here's the code:
> --
> Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), new
> QueryScorer(query));
>
> highlighter.setTextFragmenter(new SimpleFragmenter(9));
>
> String fieldName = "Title";
>
> St
Here's the code:
--
Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), new
QueryScorer(query));
highlighter.setTextFragmenter(new SimpleFragmenter(9));
String fieldName = "Title";
String text = document.getField(fieldN
Felipe Lobo wrote:
> Hi, thanks for the answer, but it didn't work.
> I stopped rewriting the query and used the QueryScorer, but it doesn't
> highlight.
> The part of the query I'm using a wildcard on is the number part, like this:
> "HC 100930027253"
> The HC is highlighted but the numbers aren't:
> "Ha
Hi,
In an attempt to balance searching efficiency against the number of open file
descriptors on my system, I cache IndexSearchers with a "last used" timestamp.
A background cache manager thread then periodically checks the cache for any
that haven't been used in a while and removes them from
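The cache being described can be sketched in plain Java like this. This is not the poster's code: timestamps are passed in explicitly to keep the sketch deterministic, where real code would use System.currentTimeMillis(), run evictIdle from a background thread, and close() each evicted IndexSearcher:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class TimedCache<K, V> {
  private static final class Entry<V> {
    final V value;
    long lastUsed;
    Entry(V value, long now) { this.value = value; this.lastUsed = now; }
  }

  private final Map<K, Entry<V>> map = new HashMap<K, Entry<V>>();

  public synchronized void put(K key, V value, long now) {
    map.put(key, new Entry<V>(value, now));
  }

  public synchronized V get(K key, long now) {
    Entry<V> e = map.get(key);
    if (e == null) return null;
    e.lastUsed = now; // touch the entry on every use
    return e.value;
  }

  // Removes entries idle longer than maxIdle and returns how many were
  // evicted; the caller would close() the evicted IndexSearchers here.
  public synchronized int evictIdle(long maxIdle, long now) {
    int evicted = 0;
    for (Iterator<Entry<V>> it = map.values().iterator(); it.hasNext();) {
      if (now - it.next().lastUsed > maxIdle) {
        it.remove();
        evicted++;
      }
    }
    return evicted;
  }

  public synchronized int size() { return map.size(); }
}
```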
Hi Anshum,
That is exactly the same code he is using (only that he does not instantiate
the collector; IndexSearcher.search(query, int) does exactly that
internally :-)
His problem was that if offset+limit is large, or Integer.MAX_VALUE, he
runs out of memory.
-
Uwe Schindler
H.-H.-Meie
@Christian : Which version of Lucene are you using?
For Lucene 2.9 this would work.
*__code snippet__*
IndexReader r = IndexReader.open("/home/anshum/index/indexname", true);
IndexSearcher s = new IndexSearcher(r);
QueryParser qp = new QueryParser("testfield",new StopAnalyzer());
Query q = qp.par
I forgot to mention: because of this, even Google (which does not use
Lucene :-]) does not let you page beyond a certain limit to a very large page
number.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Uwe Schin
On Thu, Oct 1, 2009 at 8:21 AM, iron light wrote:
> The reason is I want to dig deeper.
OK :) That's fun!
> I just read the code. And found that the index namespace (IndexWriter!) is
> so tough for me.
> Is there any document, resource or blog about the code?
In general there's no separate doc
Hi Chris,
> Uwe,
>
> > You are using TopDocs incorrectly. Normally you use *not* Integer.MAX_VALUE
> > as the upper bound of your pagination window as the number of documents. So
> > if the user wants to display documents 90 to 100, just set the number to 100
> > docs. If the user then goes to docs
Hi, thanks for the answer, but it didn't work.
I stopped rewriting the query and used the QueryScorer, but it doesn't
highlight.
The part of the query I'm using a wildcard on is the number part, like this:
"HC 100930027253"
The HC is highlighted but the numbers aren't:
"Habeas Corpus HC 100930027253 ES 10
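One commonly suggested fix for this situation (my assumption, not something confirmed in the thread) is to expand the wildcard into concrete terms before handing the query to the Highlighter: force a scoring boolean rewrite and call Query.rewrite(reader), so the QueryScorer actually sees the matching number terms. A self-contained sketch using the contrib highlighter, with an assumed "Title" field and sample text modeled on Felipe's example:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.WildcardQuery;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class WildcardHighlight {
  static String run() throws Exception {
    String text = "Habeas Corpus HC 100930027253 ES";
    RAMDirectory dir = new RAMDirectory();
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);
    IndexWriter w = new IndexWriter(dir, analyzer, true,
        IndexWriter.MaxFieldLength.UNLIMITED);
    Document d = new Document();
    d.add(new Field("Title", text, Field.Store.YES, Field.Index.ANALYZED));
    w.addDocument(d);
    w.close();

    WildcardQuery wq = new WildcardQuery(new Term("Title", "1009*"));
    // Make the wildcard expand into plain term queries; the default
    // constant-score rewrite keeps no terms for the highlighter to see.
    wq.setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);

    IndexReader r = IndexReader.open(dir, true);
    Query rewritten = wq.rewrite(r); // expands 1009* into the matching terms

    Highlighter h = new Highlighter(new SimpleHTMLFormatter(),
        new QueryScorer(rewritten));
    String fragment = h.getBestFragment(analyzer, "Title", text);
    r.close();
    return fragment;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(run());
  }
}
```

With the expansion in place, the number token gets wrapped in the formatter's default `<B>` tags.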
Thanks, Mike.
The reason is I want to dig deeper.
I just read the code, and found that the index namespace (IndexWriter!) is
so tough for me.
Is there any document, resource or blog about the code?
IL
On Thu, Oct 1, 2009 at 8:53 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> It's b
But a collector will not output the documents in sorted order...
> -Original Message-
> From: Anshum [mailto:ansh...@gmail.com]
> Sent: Thursday, October 01, 2009 1:58 PM
> To: java-use
Anshum,
> You could get the hits in a collector and pass the sort to the
> collector as it would be the collect function that handles the
> sorting.
>
> searcherObject.search(query,collector);
>
> Hope that gives you some headway. :)
Not quite (yet?) ;-)
What do you mean by passing the Sort t
Hey Christian,
Try what I wrote in the last reply. It works absolutely fine; I have tested
it on very large datasets.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Thu,
It's better to use the TermEnum API (IndexReader.terms()) to step
through the terms than to directly access the raw file (unless you
have some reason to do so...).
Mike
On Wed, Sep 30, 2009 at 6:29 AM, iron light wrote:
> I try to traverse all the term text in one tis files. And it failed. the
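The TermEnum idiom Mike recommends can be sketched like this (the two sample documents and the "body" field are assumptions for illustration):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class TermWalk {
  static int run() throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_29),
        true, IndexWriter.MaxFieldLength.UNLIMITED);
    addDoc(w, "hello world");
    addDoc(w, "hello lucene");
    w.close();

    IndexReader r = IndexReader.open(dir, true);
    TermEnum te = r.terms(); // positioned just before the first term
    int count = 0;
    while (te.next()) {
      Term t = te.term();
      System.out.println(t.field() + ":" + t.text() + " df=" + te.docFreq());
      count++;
    }
    te.close();
    r.close();
    return count;
  }

  static void addDoc(IndexWriter w, String text) throws Exception {
    Document d = new Document();
    d.add(new Field("body", text, Field.Store.NO, Field.Index.ANALYZED));
    w.addDocument(d);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(run() + " unique terms");
  }
}
```

This walks every unique term across all fields, in sorted order, without touching the .tis file directly.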
Uwe,
> You are using TopDocs incorrectly. Normally you use *not* Integer.MAX_VALUE
> as the upper bound of your pagination window as the number of documents. So if
> the user wants to display documents 90 to 100, just set the number to 100 docs.
> If the user then goes to docs 100 to 110, just reexecute t
Can you turn on IndexWriter's infoStream and post the resulting output?
Enabling calibrateSizeByDeletes doesn't automatically mean that
segments with many deletes will be merged. E.g., if your mergeFactor is
high relative to the number of segments you have at each level, then
no merging will take pl
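The two settings mentioned can be wired up roughly like this (a sketch, assuming the default merge policy is a LogMergePolicy, which it is in stock 2.9; the RAMDirectory is just for illustration):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.LogMergePolicy;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class MergeDiagnostics {
  static boolean run() throws Exception {
    IndexWriter w = new IndexWriter(new RAMDirectory(),
        new StandardAnalyzer(Version.LUCENE_29), true,
        IndexWriter.MaxFieldLength.UNLIMITED);
    w.setInfoStream(System.out); // merge decisions, flushes, etc. get logged

    LogMergePolicy mp = (LogMergePolicy) w.getMergePolicy();
    mp.setCalibrateSizeByDeletes(true); // count deletes against segment size
    mp.setMergeFactor(10); // lower values merge more aggressively

    boolean on = mp.getCalibrateSizeByDeletes();
    w.close();
    return on;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("calibrateSizeByDeletes=" + run());
  }
}
```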
You could get the hits in a collector and pass the sort to the collector,
as it's the collect function that handles the sorting.
searcherObject.search(query,collector);
Hope that gives you some headway. :)
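One concrete way to combine a collector with a sort, as discussed above, is TopFieldCollector, which keeps only the top N hits in sorted order instead of materializing every hit. A sketch with an assumed "date" sort field and sample data (Lucene 2.9 API):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TopFieldCollector;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class SortedCollect {
  static String run() throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_29),
        true, IndexWriter.MaxFieldLength.UNLIMITED);
    addDoc(w, "200908");
    addDoc(w, "200903");
    addDoc(w, "200907");
    w.close();

    IndexSearcher s = new IndexSearcher(dir, true);
    Sort sort = new Sort(new SortField("date", SortField.STRING));
    // create(sort, numHits, fillFields, trackDocScores, trackMaxScore, docsInOrder)
    TopFieldCollector collector =
        TopFieldCollector.create(sort, 10, true, false, false, false);
    s.search(new TermQuery(new Term("body", "tax")), collector);
    TopDocs td = collector.topDocs(); // hits come back sorted by the Sort

    StringBuilder sb = new StringBuilder();
    for (ScoreDoc sd : td.scoreDocs) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(s.doc(sd.doc).get("date"));
    }
    s.close();
    return sb.toString();
  }

  static void addDoc(IndexWriter w, String date) throws Exception {
    Document d = new Document();
    d.add(new Field("date", date, Field.Store.YES, Field.Index.NOT_ANALYZED));
    d.add(new Field("body", "tax analyst", Field.Store.NO, Field.Index.ANALYZED));
    w.addDocument(d);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(run());
  }
}
```

This answers the "collector will not output the documents in sorted order" objection: a plain Collector won't, but TopFieldCollector's topDocs() will.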
Hello Chris,
You are using TopDocs incorrectly. Normally you use *not* Integer.MAX_VALUE
as the upper bound of your pagination window as the number of documents. So if
the user wants to display documents 90 to 100, just set the number to 100 docs.
If the user then goes to docs 100 to 110, just reexecute t
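Uwe's pagination advice can be sketched like this: ask the searcher for exactly offset+pageSize hits and then show only the slice the user requested (the 25 sample documents and field names are assumptions):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class PageSearch {
  // Fetch one page of results: rows [offset, offset+pageSize).
  static int run(int offset, int pageSize) throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_29),
        true, IndexWriter.MaxFieldLength.UNLIMITED);
    for (int i = 0; i < 25; i++) {
      Document d = new Document();
      d.add(new Field("id", "doc" + i, Field.Store.YES, Field.Index.NOT_ANALYZED));
      d.add(new Field("body", "tax analyst", Field.Store.NO, Field.Index.ANALYZED));
      w.addDocument(d);
    }
    w.close();

    IndexSearcher s = new IndexSearcher(dir, true);
    // Ask only for enough hits to cover the window - never Integer.MAX_VALUE.
    TopDocs td = s.search(new TermQuery(new Term("body", "tax")), offset + pageSize);
    int shown = 0;
    int upto = Math.min(offset + pageSize, td.scoreDocs.length);
    for (int i = offset; i < upto; i++) {
      System.out.println(s.doc(td.scoreDocs[i].doc).get("id"));
      shown++;
    }
    s.close();
    return shown;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(run(10, 5) + " rows on this page");
  }
}
```

The priority queue inside only ever holds offset+pageSize entries, which is why this avoids the Integer.MAX_VALUE memory blowup.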
Hello everybody,
I'm looking at quite an interesting challenge right now, so I
hope that somebody out there will be able to assist me.
What I'm trying to do is return search results both sorted and
paginated. So far I haven't been able to come up with a working solution.
Pagination without so
Per-segment search over many segments is actually a bit faster for non-sorted
cases and many sorted cases - but an optimized index will still be
fastest. The speed benefit of many segments comes when reopening - so,
say, for realtime search - in that case you may want to sacrifice the
optimized perf for a segment
Hey there,
Until now, when using Lucene 2.4, I was always optimizing my index, using
the compound file format, after updating it. I did that because otherwise I
could see a big performance loss in search responses.
Now in Lucene 2.9 there are per-segment readers, and I have read something
about how it performs b
Hi all,
I have a problem using IndexWriter#deleteDocuments to delete more
than one document at once.
the following is my code:
Try 1:
StringBuffer query_values = new StringBuffer();
query_values.append(UNIQUEID_FIELD_NAME);
query_values.append(":(");
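One way to delete several documents in one call, instead of building a query string, is the deleteDocuments(Term[]) overload. A sketch under the assumption that UNIQUEID_FIELD_NAME ("uid" here) is a NOT_ANALYZED unique-id field, so each Term matches exactly one document:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class DeleteMany {
  static int run() throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_29),
        true, IndexWriter.MaxFieldLength.UNLIMITED);
    for (int i = 0; i < 5; i++) {
      Document d = new Document();
      // the unique id is indexed un-analyzed so a Term matches it exactly
      d.add(new Field("uid", "id" + i, Field.Store.YES, Field.Index.NOT_ANALYZED));
      w.addDocument(d);
    }
    w.commit();

    // One Term per unique id; a single call deletes them all.
    w.deleteDocuments(new Term[] {
        new Term("uid", "id1"),
        new Term("uid", "id3") });
    w.commit();
    w.close();

    IndexReader r = IndexReader.open(dir, true);
    int n = r.numDocs(); // deleted documents are excluded from the count
    r.close();
    return n;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(run() + " docs remain");
  }
}
```

There is also a deleteDocuments(Query) overload if the set of documents really is defined by a query rather than a list of ids.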