date:20061009

Re: Incremental updates / slow searches.

2006-10-09 Thread Mathias Lux

Rickard Bäckman wrote: > Hi, > > we are using a search system based on Lucene and have recently tried to add > incremental updating of the index instead of building a new index every now > and then. However we now run into problems as our searches start to take > a very long time to complete. > >

Re: Incremental updates / slow searches.

2006-10-09 Thread Yonik Seeley

On 10/9/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: don't forget to optimize your index every now and then as well... deleting a document just marks it as "deleted"; it still gets inspected by every query during scoring at least once to see that it can skip it. Optimizing is the only thing t

Re: Incremental updates / slow searches.

2006-10-09 Thread Chris Hostetter

don't forget to optimize your index every now and then as well... deleting a document just marks it as "deleted"; it still gets inspected by every query during scoring at least once to see that it can skip it. Optimizing is the only thing that truly removes the "deleted" documents. : Date: Mo
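The point about deletes can be pictured with a toy model. The sketch below is not Lucene code: `ToySegment` and all of its methods are hypothetical names made up for illustration, and real segments are far more involved. It only shows why a document that is merely marked as deleted still occupies a slot that queries must step over until the storage is rewritten.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only (hypothetical class, not part of Lucene).
final class ToySegment {
    private List<String> docs = new ArrayList<>();
    private BitSet deleted = new BitSet();

    void add(String doc) {
        docs.add(doc);
    }

    // Deleting only flips a bit; the document stays in storage.
    void delete(int docId) {
        if (docId < 0 || docId >= docs.size()) {
            throw new IndexOutOfBoundsException("no such document: " + docId);
        }
        deleted.set(docId);
    }

    // Number of slots a query has to step over, deleted ones included.
    int maxDoc() {
        return docs.size();
    }

    // Number of live (not deleted) documents.
    int numDocs() {
        return docs.size() - deleted.cardinality();
    }

    // Rewrites storage so that deleted documents are physically gone.
    void optimize() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        docs = live;
        deleted = new BitSet();
    }
}
```

In this model, `maxDoc()` stays at its old value after `delete(...)` and only shrinks once `optimize()` has run, which is the gap the message describes.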

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson

Doron: Thanks for the suggestion, I'll certainly put it on my list, depending upon what the PM decides. This app is genealogy research, and users *can* put in their own wildcards... This is why I love this list... lots of smart people giving me suggestions I never would have thought of ... Th

Re: FieldSelectorResult instance descriptions?

2006-10-09 Thread Grant Ingersoll

See http://www.gossamer-threads.com/lists/lucene/java-dev/33964?search_string=Lazy%20Field%20Loading;#33964 for the discussion on Java Dev from wayback if you want more background info. To some extent, I still think Lazy Fields are in the early adopter stage, since they haven't officially b

RE: QueryParser syntax French Operator

2006-10-09 Thread Patrick Turcotte

Hi, I was thinking of something along those lines. Last week, I was able to take time to understand the JavaCC syntax and possibilities. I have some cleaning up, testing and documentation to do, but basically, I was able to expand the AND / OR / NOT patterns at r

Re: FieldSelectorResult instance descriptions?

2006-10-09 Thread Chris Hostetter

: If you read the entire source as I did, it becomes clear! :) : The interesting code is in FieldsReader. Not necessarily. There can be differences between how constants are used and how they are supposed to be used (depending on whether or not the code using them has any bugs in it) : NO_LOAD

Re: wildcard and span queries

2006-10-09 Thread Doron Cohen

"Erick Erickson" <[EMAIL PROTECTED]> wrote on 09/10/2006 13:09:21: > ... The kicker is that what we are indexing is > OCR data, some of which is pretty trashy. So you wind up with "interesting" > words in your index, things like rtyHrS. So the whole question of allowing > very specific queries on d

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson

I've already started that conversation with the PM, I'm just trying to get a better idea of what's possible. I'll whimper tooth and nail to keep from having to do a lot of work to add a feature to a product that nobody in their right mind would ever use . As far as the grammar, we don't actually

Re: wildcard and span queries

2006-10-09 Thread Paul Elschot

Erick, On Monday 09 October 2006 21:20, Erick Erickson wrote: > OK, forget the stuff about "TooManyBooleanClauses". I finally figured out > that if I specify the surround to have the same semantics as a SpanRegex ( > i.e, and(eri*, mal*)) it blows up with TooManyBooleanClauses. So that makes > mor

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson

OK, forget the stuff about "TooManyBooleanClauses". I finally figured out that if I specify the surround to have the same semantics as a SpanRegex ( i.e, and(eri*, mal*)) it blows up with TooManyBooleanClauses. So that makes more sense to me now. Specifying 20w(eri*, mal*) is what I was using bef

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson

OK, I'm using the surround code, and it seems to be working...with the following questions (always, more questions)... I'm getting an exception sometimes of TooManyBasicQueries. I can control this by initializing BasicQueryFactory with a larger number. Do you have any cautions about upping this

Re: Lucene searching algorithm

2006-10-09 Thread Grant Ingersoll

Hi Michael, I think there are a number of good resources on this: 1. http://lucene.apache.org/java/scoring.html covers the basics of searching. The bottom has some pseudo code as well. 2. Lucene In Action 3. Search this list and other places for information on the Vector Space Model.

Re: threadsafe QueryParser?

2006-10-09 Thread Yonik Seeley

On 10/9/06, Stanislav Jordanov <[EMAIL PROTECTED]> wrote: Method static public Query parse(String query, String field, Analyzer analyzer) in class QueryParser is deprecated in 1.9.1 and the suggestion is: /"Use an instance of QueryParser and the {@link #parse(String)} method instead."

Re: Incremental updates / slow searches.

2006-10-09 Thread Yonik Seeley

The biggest thing would be to limit how often you open a new IndexSearcher, and when you do, warm up the new searcher in the background while you continue serving searches with the existing searcher. This is the strategy that Solr uses. There is also the issue of if you are analyzing/merging doc
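The warm-then-swap strategy described above can be sketched without any Lucene classes. Everything here is an assumption made for illustration: `WarmSwap` is a hypothetical holder, `S` stands in for whatever searcher type the application uses, and the `opener` and `warmer` callbacks are placeholders for opening a new searcher and running a few representative queries against it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical holder: serves the current searcher while a new one warms up.
final class WarmSwap<S> {
    private final AtomicReference<S> current;

    WarmSwap(S initial) {
        current = new AtomicReference<>(initial);
    }

    // Search threads call this; they always get a fully warmed searcher.
    S get() {
        return current.get();
    }

    // Opens and warms a replacement in the background, then swaps it in.
    CompletableFuture<S> refresh(Supplier<S> opener, Consumer<S> warmer) {
        return CompletableFuture.supplyAsync(() -> {
            S fresh = opener.get();   // open against the updated index
            warmer.accept(fresh);     // run warm-up queries before exposing it
            current.set(fresh);       // atomic swap; readers see old or new
            return fresh;
        });
    }
}
```

The sketch deliberately leaves out closing the old searcher. A real application would have to release it only after in-flight searches that still hold it have finished.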

Re: TermQuery and PhraseQuery..problem with word with space

2006-10-09 Thread Ismail Siddiqui

in fav_stores i see "Banana Republic" and "Ann Taylor" there .. and i am searching it with the capitals. On 10/9/06, Erick Erickson <[EMAIL PROTECTED]> wrote: OK, when you look in the "fav_stores" field in Luke, what do you see? And, are you searching on "Banana Republic" with the capitals? I

Re: TermQuery and PhraseQuery..problem with word with space

2006-10-09 Thread Doron Cohen

I would guess that one of your assumptions is wrong... The assumptions to check are: At indexing: - lpf.getLuceneFieldName() == "fav_stores" - pa.getPersonProfileChoice().getChoice() == "Banana Republic" At search: - the query is created like this: new TermQuery(new Term("fav_stores","Banana R

Re: TermQuery and PhraseQuery..problem with word with space

2006-10-09 Thread Erick Erickson

OK, when you look in the "fav_stores" field in Luke, what do you see? And, are you searching on "Banana Republic" with the capitals? If so, and your index has the letters in lower case, that's your problem. Erick On 10/9/06, Ismail Siddiqui <[EMAIL PROTECTED]> wrote: I am using StandardAnalyz
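The case mismatch described here can be illustrated with a toy term index. This is not Lucene code: `ToyTermIndex` and its methods are hypothetical, it models a single field only, and the "analysis" is just lowercasing plus a whitespace split. It shows why an exact term lookup for "Banana Republic" finds nothing when the indexed terms were lowercased and split, yet succeeds when the whole value was stored as one untokenized term.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only (hypothetical class, single field, not part of Lucene).
final class ToyTermIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Untokenized: the whole value becomes one exact, case-preserved term.
    void addUntokenized(int docId, String value) {
        postings.computeIfAbsent(value, k -> new HashSet<>()).add(docId);
    }

    // Analyzed: lowercase, then split on whitespace into separate terms.
    void addAnalyzed(int docId, String value) {
        for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new HashSet<>()).add(docId);
            }
        }
    }

    // Exact term lookup: no analysis is applied to the query term.
    boolean matches(String term) {
        return postings.containsKey(term);
    }
}
```

In this model the lookup term has to equal an indexed term character for character, so checking the actual terms in the index (as suggested with Luke) is what tells you which form to search for.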

Re: deleteDocuments being ingnored

2006-10-09 Thread Simon Willnauer

System.out.println("Indexing " + f.getAbsolutePath()); Document doc = new Document(); doc.add(new Field("contents",loadContents (doc),Field.Store.NO,Field.Index.TOKENIZED)); doc.add(new Field("filename", f.getAbsolutePath(),Field.Stor

Re: deleteDocuments being ingnored

2006-10-09 Thread cfowler

My apologies, the IndexReader code I included was a commented out trial. Here is the active version. Sorry for the error: IndexReader ir = IndexReader.open(indexDir); System.out.println(">>>" + ir.numDocs()); int deleted = ir.deleteDocuments(new Ter

deleteDocuments being ingnored

2006-10-09 Thread cfowler

Hello, I'm brand new to this, so hopefully you can help me. I'm attempting to use the IndexReader object in lucene v2 to delete and readd documents. I very easily set up an index and my documents are added. Now I'm trying to update the same index by deleting the document before readdin

Re: TermQuery and PhraseQuery..problem with word with space

2006-10-09 Thread Ismail Siddiqui

I am using StandardAnalyzer while indexing the field.. I am also creating a field called full_text in which i am adding all these individual fields as TOKENIZED. here is the code while(choiceIt.hasNext()){ PersonProfileAnswer pa=(PersonProfileAnswer)choiceIt.next(); if(p

Re: How to search with empty content

2006-10-09 Thread Scott

You can get all documents by using MatchAllDocsQuery. Kumar, Samala Santhosh (TPKM) wrote: I want to search without giving any input, when I search leaving blank the search text box it should give me all the documents present in the index. please give me some solution or pointers. regards Sa

Re: Performing a like query

2006-10-09 Thread Steven Rowe

Hi Rahil, Rahil wrote: > I was just wondering whether there is a > difference between the regular expression you sent me i.e. > (i) \s*(?:\b|(?<=\S)(?=\s)|(?<=\s)(?=\S))\s* > >and > (ii) \\b > > as they lead to the same output. For example, the string search "testing > a-new string=3/4

How to search with empty content

2006-10-09 Thread Kumar, Samala Santhosh (TPKM)

I want to search without giving any input, when I search leaving blank the search text box it should give me all the documents present in the index. please give me some solution or pointers. regards Santhosh

Re: highlight optimization

2006-10-09 Thread Erick Erickson

The fastest way to see if opening/closing your searcher is a problem would be to write a tiny little program that opened the index, fired off a few queries and timed each one. The queries can be canned, of course. I'm thinking this is, say, less than 20 lines (including imports). If you're familia

threadsafe QueryParser?

2006-10-09 Thread Stanislav Jordanov

Method static public Query parse(String query, String field, Analyzer analyzer) in class QueryParser is deprecated in 1.9.1 and the suggestion is: /"Use an instance of QueryParser and the {@link #parse(String)} method instead."/ My question is: in the context of multi threaded app, is

highlight optimization

2006-10-09 Thread Stelios Eliakis

Hi, I have a collection of 500 txt documents and I implement a web application(JSP) for searching these documents. In addition, the application shows the BestFragment of each result and highlights the query terms. My application is slow enough (about 2,5-3 seconds for each query) even if I run it

Re: Performing a like query

2006-10-09 Thread Rahil

Hi Steve Thanks for your response. I was just wondering whether there is a difference between the regular expression you sent me i.e. (i) \s*(?:\b|(?<=\S)(?=\s)|(?<=\s)(?=\S))\s* and (ii) \\b as they lead to the same output. For example, the string search "testing a-new string=3/4

Incremental updates / slow searches.

2006-10-09 Thread Rickard Bäckman

Hi, we are using a search system based on Lucene and have recently tried to add incremental updating of the index instead of building a new index every now and then. However we now run into problems as our searches start to take a very long time to complete. Our index is about 8-9GB large and we

Re: lucene link database

2006-10-09 Thread mark harwood

>>if you search the archive for database you'll bet a bunch of threads This was a hybrid implementation I did which worked with HSQLDB and Derby: http://www.mail-archive.com/java-user@lucene.apache.org/msg02953.html Cheers Mark - Original Message From: Erick Erickson <[EMAIL PROTECTED

Re: TermQuery and PhraseQuery..problem with word with space

2006-10-09 Thread Doron Cohen

> I am trying to index a field which has more than one word with space e.g. > "My Word" > i am indexng it UN_TOKENIZED .. but when i use TermQuery to query "My Word" > its not yielding any result.. Seems that it should work. Few things to check: - make sure you are indexing with UN_TOKENIZED. - c

32 matches
