date:20091001

Re: Help needed ordering search results

2009-10-01 Thread Karl Wettin

I'm not quite sure what you are asking for, but I think you want to use a span near query (to add a boost to phrases) inside a disjunction max query (to define the weights of the different fields). karl On 1 Oct 2009, at 02:40, mitu2009 wrote: Hi, I've 3 records in Lucene index. Record 1 contains healt

Help needed bubbling up relevant records with most recent date

2009-10-01 Thread mitu2009

Hi, I've got 5 records in Lucene index. a. Record 1 contains: tax analysis. Date field value is March 2009. b. Record 2 contains: Senior tax analyst. Date field value is Aug 2009. c. Record 3 contains: Senior tax analyst. Date field value is July 2009. d. Record 4 contains: tax analyst. Date field value

Efficiently reopening remotely-distributed indexes in 2.9?

2009-10-01 Thread Nigel

I have a question about the reopen functionality in Lucene 2.9. As I understand it, since FieldCaches are now per-segment, it can avoid reloading everything when the index is reopened, and instead just load the new segments. For background, like many people we have a distributed architecture wher

TermPositions with custom Tokenizer

2009-10-01 Thread Christopher Tignor

Hello, I have created a custom Tokenizer and am trying to set and extract my own positions for each Token using: reusableToken.reinit(word.getWord(),tokenStart,tokenEnd); later when querying my index using a SpanTermQuery the start() and end() tags don't correspond to these values but seem to co

Re: document diversity

2009-10-01 Thread Tricia Williams

Hi Mike, The first thing that comes to mind is to run a query for each document type (assuming that you have a field that stores the type) and qualify the document type: for example type:pdf. Then you would have to write something to combine the query results drawing an equal number of hits
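The combining step suggested above (one result list per document type, drawing hits from each in turn) could be sketched roughly as follows. This is only an illustration in plain Java: `RoundRobinMerge` and `interleave` are hypothetical names, not Lucene APIs, and the per-type lists are assumed to have been produced already by separate searches such as type:pdf.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Lucene: interleaves per-type result lists.
class RoundRobinMerge {

    // Takes one hit from each list in turn until `limit` hits are collected
    // or every list is exhausted.
    static <T> List<T> interleave(List<List<T>> perTypeHits, int limit) {
        List<Iterator<T>> iterators = new ArrayList<>();
        for (List<T> hits : perTypeHits) {
            iterators.add(hits.iterator());
        }
        List<T> merged = new ArrayList<>();
        boolean progressed = true;
        while (merged.size() < limit && progressed) {
            progressed = false;
            for (Iterator<T> it : iterators) {
                if (merged.size() >= limit) {
                    break;
                }
                if (it.hasNext()) {
                    merged.add(it.next());
                    progressed = true;
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Assumed inputs: hits already ranked within each document type.
        List<List<String>> hits = Arrays.asList(
            Arrays.asList("a.pdf", "b.pdf", "c.pdf"),
            Arrays.asList("a.txt"),
            Arrays.asList("a.html", "b.html"));
        System.out.println(interleave(hits, 5));
    }
}
```

When one type runs out of hits, the remaining types simply fill the rest of the page, so the distribution is only as even as the per-type result counts allow.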

Re: document diversity

2009-10-01 Thread Phil Whelan

Hi Mike, I'd simply store a field "doctype" with values "pdf", "txt", "html" and perform a separate search for each type. Although, I'd be interested if anyone has a cooler way of doing this. Cheers, Phil On Thu, Oct 1, 2009 at 9:56 AM, Michael Masters wrote: > I was wondering if there is any w

Re: Filtering on two date fields simultaneously

2009-10-01 Thread Dragan Jotanovic

Thanks, I will try NumberRangeQuery On Thu, Oct 1, 2009 at 4:01 PM, Grant Ingersoll wrote: > > On Sep 29, 2009, at 11:30 AM, Dragan Jotanovic wrote: > >> Hi, I was thinking a long time how to implement this kind of >> functionality but couldn't figure out anything appropriate. >> In my lucene doc

document diversity

2009-10-01 Thread Michael Masters

I was wondering if there is any way to control what kind of documents are returned from a search. For example, let's say we have an index built from different types of documents (pdf, txt, html, etc.). Is there a way to have the first x results have a specified distribution of document types? It wou

Re: [ANN] Luke 0.9.9 release

2009-10-01 Thread Andrzej Bialecki

Andrzej Bialecki wrote: Hi all, I'm happy to announce the new release of Luke - the Lucene Index Toolbox. There's a bug in this version in that it doesn't show TermVectors for a field. I'll fix it in a few days - I'm waiting for other potential bugs to show up. So if you find something that

Re: Filtering on two date fields simultaneously

2009-10-01 Thread Grant Ingersoll

On Sep 29, 2009, at 11:30 AM, Dragan Jotanovic wrote: Hi, I have been thinking for a long time about how to implement this kind of functionality but couldn't figure out anything appropriate. In my Lucene document, I have two date fields: start and end date. As a search input I have current date (NOW). I need t

Re: Implement SpanScorer on 2.9 lucene lib!

2009-10-01 Thread Mark Miller

Felipe Lobo wrote: > Here's the code: > -- > Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), new > QueryScorer(query)); > > highlighter.setTextFragmenter(new SimpleFragmenter(9)); > > String fieldName = "Title"; > > St

Re: Implement SpanScorer on 2.9 lucene lib!

2009-10-01 Thread Felipe Lobo

Here's the code: -- Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), new QueryScorer(query)); highlighter.setTextFragmenter(new SimpleFragmenter(9)); String fieldName = "Title"; String text = document.getField(fieldN

Re: Implement SpanScorer on 2.9 lucene lib!

2009-10-01 Thread Mark Miller

Felipe Lobo wrote: > Hi, thanks for the answer, but it didn't work. > I stopped rewriting the query and used the QueryScorer, but it doesn't > highlight. > The part of the query where I'm using a wildcard is the number part, like this: > "HC 100930027253" > The HC is highlighted but the numbers aren't: > "Ha

How to test if an IndexReader is still open?

2009-10-01 Thread Chris Bamford

Hi, In an attempt to balance searching efficiency against the number of open file descriptors on my system, I cache IndexSearchers with a "last used" timestamp. A background cache manager thread then periodically checks the cache for any that haven't been used in a while and removes them from
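The caching scheme described above (entries stamped with a "last used" time and a periodic sweep that drops idle ones) might look roughly like this. It is a minimal sketch under stated assumptions: `IdleCache`, `put`, `get`, and `evictIdle` are hypothetical names, the cached values are modelled as any `AutoCloseable` rather than a real IndexSearcher, and the clock is passed in explicitly so the behaviour is easy to check.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache, not a Lucene class: closes values that have been idle too long.
class IdleCache<K, V extends AutoCloseable> {

    // Pairs a cached value with the time it was last handed out.
    private static final class Slot<V> {
        final V value;
        volatile long lastUsed;

        Slot(V value, long lastUsed) {
            this.value = value;
            this.lastUsed = lastUsed;
        }
    }

    private final Map<K, Slot<V>> slots = new ConcurrentHashMap<>();

    void put(K key, V value, long now) {
        slots.put(key, new Slot<>(value, now));
    }

    // Returns the cached value and refreshes its timestamp, or null if absent.
    V get(K key, long now) {
        Slot<V> slot = slots.get(key);
        if (slot == null) {
            return null;
        }
        slot.lastUsed = now;
        return slot.value;
    }

    // Removes and closes every entry idle for longer than maxIdleMillis;
    // returns how many entries were evicted.
    int evictIdle(long now, long maxIdleMillis) {
        int evicted = 0;
        Iterator<Map.Entry<K, Slot<V>>> it = slots.entrySet().iterator();
        while (it.hasNext()) {
            Slot<V> slot = it.next().getValue();
            if (now - slot.lastUsed > maxIdleMillis) {
                it.remove();
                try {
                    slot.value.close();
                } catch (Exception e) {
                    // A real cache manager would log this failure.
                }
                evicted++;
            }
        }
        return evicted;
    }
}
```

Note that this sketch does nothing to stop a caller from still holding a value at the moment the sweep closes it; handling that race is left open here.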

RE: Pagination and Sorting

2009-10-01 Thread Uwe Schindler

Hi Anshum, That is exactly the same code he is using (only that he does not instantiate the collector; IndexSearcher.search(query, int) does exactly that internally :-) His problem was that if offset+limit is large, or Integer.MAX_VALUE, he runs out of memory. - Uwe Schindler H.-H.-Meie

Re: Pagination and Sorting

2009-10-01 Thread Anshum

@Christian : Which version of Lucene are you using? For lucene 2.9 this would work. *__code snippet__* IndexReader r = IndexReader.open("/home/anshum/index/indexname", true); IndexSearcher s = new IndexSearcher(r); QueryParser qp = new QueryParser("testfield",new StopAnalyzer()); Query q = qp.par

RE: Pagination and Sorting

2009-10-01 Thread Uwe Schindler

I forgot to mention: Because of this, e.g. even Google (who do not use Lucene :-]) does not let you go beyond a limit to a very large page number. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schin

Re: How does the term infos file (.tis) works?

2009-10-01 Thread Michael McCandless

On Thu, Oct 1, 2009 at 8:21 AM, iron light wrote: > The reason is I want to dig deeper. OK :) That's fun! > I just read the code. And found that the index namespace (IndexWriter!) is > so tough for me. > Is there any document, resource or blog about the code? In general there's no separate doc

RE: Pagination and Sorting

2009-10-01 Thread Uwe Schindler

Hi Chris, > Uwe, > > > You are using TopDocs incorrectly. Normally you do *not* use Integer.MAX_VALUE; > > instead, use the upper bound of your pagination window as the number of documents. So if the > > user wants to display documents 90 to 100, just set the number to 100 docs. > > If the user then goes to docs

Re: Implement SpanScorer on 2.9 lucene lib!

2009-10-01 Thread Felipe Lobo

Hi, thanks for the answer, but it didn't work. I stopped rewriting the query and used the QueryScorer, but it doesn't highlight. The part of the query where I'm using a wildcard is the number part, like this: "HC 100930027253" The HC is highlighted but the numbers aren't: "Habeas Corpus HC 100930027253 ES 10

Re: How does the term infos file (.tis) works?

2009-10-01 Thread iron light

Thanks, Mike. The reason is I want to dig deeper. I just read the code. And found that the index namespace (IndexWriter!) is so tough for me. Is there any document, resource or blog about the code? IL On Thu, Oct 1, 2009 at 8:53 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > It's b

RE: Pagination and Sorting

2009-10-01 Thread Uwe Schindler

But a collector will not output the documents in sorted order... - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Anshum [mailto:ansh...@gmail.com] > Sent: Thursday, October 01, 2009 1:58 PM > To: java-use

Re: Pagination and Sorting

2009-10-01 Thread Christian Robert

Anshum, > You could get the hits in a collector and pass the sort to the > collector as it would be the collect function that handles the > sorting. > > searcherObject.search(query,collector); > > Hope that gives you some headway. :) Not quite (yet?) ;-) What do you mean by passing the Sort t

Re: Pagination and Sorting

2009-10-01 Thread Anshum

Hey Christian, Try what I wrote in the last reply. Would work absolutely fine. Have tested that for very large datasets. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Thu,

Re: How does the term infos file (.tis) works?

2009-10-01 Thread Michael McCandless

It's better to use the TermEnum API (IndexReader.terms()) to step through the terms, than to directly access the raw file (unless you have some reason to do so...). Mike On Wed, Sep 30, 2009 at 6:29 AM, iron light wrote: > I try to traverse all the term text in one tis files. And it failed. the

Re: Pagination and Sorting

2009-10-01 Thread Christian Robert

Uwe, > You are using TopDocs incorrectly. Normally you do *not* use Integer.MAX_VALUE; > instead, use the upper bound of your pagination window as the number of documents. So if the > user wants to display documents 90 to 100, just set the number to 100 docs. > If the user then goes to docs 100 to 110, just reexecute t

Re: Results of setting LogMergePolicy "calibrateSizeByDeletes=true"

2009-10-01 Thread Michael McCandless

Can you turn on IndexWriter's infoStream and post the resulting output? Enabling calibrateSizeByDeletes doesn't automatically mean that segments with many deletes will be merged. For example, if your mergeFactor is high relative to the number of segments you have at each level, then no merging will take pl

Re: Pagination and Sorting

2009-10-01 Thread Anshum

You could get the hits in a collector and pass the sort to the collector as it would be the collect function that handles the sorting. searcherObject.search(query,collector); Hope that gives you some headway. :) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here b

RE: Pagination and Sorting

2009-10-01 Thread Uwe Schindler

Hello Chris, You are using TopDocs incorrectly. Normally you do *not* use Integer.MAX_VALUE; instead, use the upper bound of your pagination window as the number of documents. So if the user wants to display documents 90 to 100, just set the number to 100 docs. If the user then goes to docs 100 to 110, just reexecute t
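The windowing idea above (ask only for the top offset+limit hits, then show the last `limit` of them) can be illustrated without Lucene. The sketch below is an assumption-laden stand-in, not the library's implementation: `PageWindow` and `page` are hypothetical names, and a bounded heap plays the role of the top-N collection, so memory grows with offset+limit rather than with the total number of matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene: sorted pagination over a bounded top-N.
class PageWindow {

    // Keeps only the best (offset + limit) hits according to `order`,
    // then returns the slice starting at `offset`.
    static <T> List<T> page(Iterable<T> hits, Comparator<T> order, int offset, int limit) {
        int n = offset + limit;
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Reversed order puts the worst retained hit at the head, so it is evicted first.
        PriorityQueue<T> heap = new PriorityQueue<>(order.reversed());
        for (T hit : hits) {
            heap.add(hit);
            if (heap.size() > n) {
                heap.poll();
            }
        }
        List<T> top = new ArrayList<>(heap);
        Collections.sort(top, order);
        int from = Math.min(offset, top.size());
        return new ArrayList<>(top.subList(from, top.size()));
    }

    public static void main(String[] args) {
        List<Integer> hits = Arrays.asList(5, 3, 9, 1, 7, 2, 8);
        // Third to fifth hit in ascending order.
        System.out.println(page(hits, Comparator.<Integer>naturalOrder(), 2, 3));
    }
}
```

Each page request repeats the search with a larger n, which is why very deep pages get progressively more expensive.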

Pagination and Sorting

2009-10-01 Thread Christian Robert

Hello everybody, I'm looking at quite an interesting challenge right now, so I hope that somebody out there will be able to assist me. What I'm trying to do is return search results both sorted and paginated. So far I haven't been able to come up with a working solution. Pagination without so

Re: Lucene 2.9 and performance of readers per segment.

2009-10-01 Thread Mark Miller

Per-segment search over many segments is actually a bit faster for non-sort cases and many sort cases, but an optimized index will still be fastest. The speed benefit of many segments comes when reopening, say for realtime search; in that case you may want to sacrifice the optimized performance for a segment

Lucene 2.9 and performance of readers per segment.

2009-10-01 Thread Marc Sturlese

Hey there, Until now, when using Lucene 2.4, I always optimized my index using the compound file format after updating it. I did that because otherwise I noticed a significant performance loss in search responses. Now in Lucene 2.9 there are per-segment readers, and I have read something about it performs b

Why it doesn't work about IndexWriter deleteDocuments

2009-10-01 Thread Bon

Hi all, I have a problem using IndexWriter#deleteDocuments to delete more than one document at once. The following is my code: Try 1: StringBuffer query_values = new StringBuffer(); query_values.append(UNIQUEID_FIELD_NAME); query_values.append(":(");

33 matches
