Query in IndexWriter.deleteDocuments(Term term)

2008-07-25 Thread java_is_everything
Hi all. This may seem a longish and informal mail, but do correct me if my assumptions are wrong anywhere, otherwise my actual doubt will make no sense. Say I opened an IndexWriter on an initially empty directory, using autocommit = true. Now, what I do is add and delete documents randomly. I se
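
For context, a minimal sketch of deleting by term with an autocommitting writer, as described in the question. The index path and the "id" field are hypothetical, and the constructor shown is the Lucene 2.3-era autoCommit variant:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.store.FSDirectory;

    public class DeleteByTermExample {
        public static void main(String[] args) throws Exception {
            FSDirectory dir = FSDirectory.getDirectory("/tmp/test-index"); // hypothetical path
            // autoCommit = true, create = true (Lucene 2.3-era constructor)
            IndexWriter writer = new IndexWriter(dir, true, new StandardAnalyzer(), true);

            Document doc = new Document();
            doc.add(new Field("id", "42", Field.Store.YES, Field.Index.UN_TOKENIZED));
            writer.addDocument(doc);

            // Marks every document whose "id" term equals "42" as deleted.
            writer.deleteDocuments(new Term("id", "42"));
            writer.close();
        }
    }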

Re: How to use lucene for high search performance ?

2008-07-25 Thread Michael McCandless
Let's move this thread to java-user (CC'd). 王建新 wrote: Thank you. If the index files are very big (10G), I cannot load them into RAM in one process. Ahh OK. Should I use MultiSearcher to load index files with several processes? How about its performance? MultiSearcher alone doesn't really
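
For reference, a minimal sketch of searching several index directories through MultiSearcher within a single process (paths are hypothetical); note that the truncated reply above suggests MultiSearcher by itself is not a performance fix:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.MultiSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.Searchable;

    public class MultiIndexSearch {
        public static void main(String[] args) throws Exception {
            // One IndexSearcher per physical index (hypothetical paths).
            Searchable[] shards = {
                new IndexSearcher("/indexes/part1"),
                new IndexSearcher("/indexes/part2")
            };
            MultiSearcher searcher = new MultiSearcher(shards);

            Query q = new QueryParser("contents", new StandardAnalyzer()).parse("lucene");
            Hits hits = searcher.search(q);   // 2.x-era Hits API
            System.out.println("total matches: " + hits.length());
            searcher.close();
        }
    }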

Re: Deleted documents in the index.

2008-07-25 Thread ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)
Hey Michael, The maxDoc() did the trick! Thanks! I have got some reading to do about numDocs() and maxDoc(). Nagesh On 7/25/08, ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S) <[EMAIL PROTECTED]> wrote: > Hi Michael, > The numDocs did come from IndexReader.numDocs(). > > hmm...let me try with maxDoc. > > N

Re: caching fields for query performance

2008-07-25 Thread Yonik Seeley
Yes, pull out language:ENG language:FRA language:RUS into filters, cache them, and re-use them across all queries. In Lucene, see CachingWrapperFilter() In Solr, use a separate fq (for filter query) parameter... q=+topic:m&a +topic:earn +company:MSFT&fq=language:ENG -Yonik On Fri, Jul 25, 200
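
On the Lucene side, a hedged sketch of the pattern Yonik describes: build one cached filter per language and reuse the same instance across queries. The wrapping via QueryWrapperFilter and the field/value names follow the example in the message but are otherwise assumptions:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.CachingWrapperFilter;
    import org.apache.lucene.search.Filter;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.QueryWrapperFilter;
    import org.apache.lucene.search.TermQuery;

    public class LanguageFilterExample {
        // Built once and kept; CachingWrapperFilter caches the bit set per reader.
        static final Filter ENGLISH = new CachingWrapperFilter(
                new QueryWrapperFilter(new TermQuery(new Term("language", "ENG"))));

        static Hits searchEnglish(IndexSearcher searcher, Query userQuery) throws Exception {
            // The filter restricts results without affecting scoring, and its cached
            // bits are reused for every query run against the same reader.
            return searcher.search(userQuery, ENGLISH);
        }
    }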

Re: Deleted documents in the index.

2008-07-25 Thread ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)
Hi Michael, The numDocs did come from IndexReader.numDocs(). hmm...let me try with maxDoc. Nagesh On 7/25/08, Michael McCandless <[EMAIL PROTECTED]> wrote: > > Oh, I think I see the problem -- instead of numDocs in your for loop > (which I assume came from IndexReader.numDocs()) change that to m

Re: Deleted documents in the index.

2008-07-25 Thread Michael McCandless
Oh, I think I see the problem -- instead of numDocs in your for loop (which I assume came from IndexReader.numDocs()) change that to maxDoc (IndexReader.maxDoc()). Mike (Nagesh S) wrote: Hi Michael, Thanks for your response. Yes, I got that. I guess, my question is, how do I access the
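
The suggested change, sketched below: iterate up to maxDoc() and skip deleted slots, since numDocs() counts only live documents while document numbers can run up to maxDoc() - 1. The "title" field is a hypothetical stored field:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexReader;

    public class IndexReport {
        static void report(IndexReader reader) throws Exception {
            for (int i = 0; i < reader.maxDoc(); i++) {   // not numDocs()
                if (reader.isDeleted(i)) {
                    continue;                             // skip deleted slots
                }
                Document doc = reader.document(i);
                System.out.println(i + ": " + doc.get("title"));
            }
        }
    }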

Document path in lucene index

2008-07-25 Thread starz10de
Hi All, I am reading the index and printing the index terms and their corresponding paths. I can print the index terms, but I don't know if there is any possibility to print the corresponding paths; I can only print the docid, but I need to print the paths, as is possible in the searcher (query).
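
A common approach, assuming the path was added to each document as a stored field (here hypothetically named "path"): walk the postings for a term with TermDocs and load the stored field for each matching doc id.

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermDocs;

    public class TermPathsExample {
        static void printPathsForTerm(IndexReader reader, String field, String text) throws Exception {
            TermDocs termDocs = reader.termDocs(new Term(field, text));
            while (termDocs.next()) {
                Document doc = reader.document(termDocs.doc());
                // Works only if the path was stored at indexing time (Field.Store.YES).
                System.out.println(doc.get("path"));
            }
            termDocs.close();
        }
    }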

Re: Ignoring XML tags when Indexing

2008-07-25 Thread Kalani Ruwanpathirana
Thanks both of you :) I will try that out Kalani On Fri, Jul 25, 2008 at 5:08 PM, Marcelo Schneider < [EMAIL PROTECTED]> wrote: > Daniel Noll wrote: > >> What makes more sense (at least the way I see it) is to implement a Reader >> which returns the text you need from the XML. This sort of thin

Re: Deleted documents in the index.

2008-07-25 Thread ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)
Hi Michael, Thanks for your response. Yes, I got that. I guess my question is, how do I access the newly added document? In other words, if the index initially had 20 docs of which 10 were updated (that is, deleted and then added), how do I access the updated ones? Initially, there was no chec

Re: Lucene write locks

2008-07-25 Thread Michael McCandless
It's not enough to have a synchronized block only, unless, e.g., within the synchronized block you open the IndexWriter, add docs, and close it. Basically, you must also ensure only one instance of IndexWriter is open at once. If you try to open another instance of IndexWriter while a previous
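
A hedged sketch of the single-shared-writer pattern Mike describes (class and field names are invented): all threads funnel their adds through one lazily opened IndexWriter instead of each opening its own instance.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;

    public class SharedWriter {
        private final Directory dir;
        private IndexWriter writer;          // the one open instance, guarded by "this"

        public SharedWriter(Directory dir) {
            this.dir = dir;
        }

        public synchronized void add(Document doc) throws Exception {
            if (writer == null) {
                // false = append to an existing index; pass true to create it the first time
                writer = new IndexWriter(dir, new StandardAnalyzer(), false);
            }
            writer.addDocument(doc);
        }

        public synchronized void close() throws Exception {
            if (writer != null) {
                writer.close();
                writer = null;
            }
        }
    }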

Re: Deleted documents in the index.

2008-07-25 Thread Michael McCandless
When you call updateDocument, the old document is deleted but a wholly new document is added. So the "else" clause in your loop below will report on the newly added documents (you won't miss any). Mike (Nagesh S) wrote: Hi, I think, the earlier mail didn't make it through. I am writin

Deleted documents in the index.

2008-07-25 Thread ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)
Hi, I think, the earlier mail didn't make it through. I am writing a class to report on an index. This index has documents updated using the IndexWriter.updateDocument(Term, Document) method. That is, documents were deleted and added again. My aim is to see what documents (and their fields) are pr

caching fields for query performance

2008-07-25 Thread Robert Stewart
If I have a frequently queried field, which has a single value per document (such as language), how can I pre-cache all field values, such that the underlying query processing always uses memory cache (never disk i/o) for that particular field? I don't know if it is possible without some custom
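
One possibility, offered here as an assumption rather than the thread's answer (the reply above went with cached filters instead): Lucene's FieldCache loads a single-valued field into memory once per reader, so repeated lookups avoid disk I/O. The "language" field name follows the question.

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.FieldCache;

    public class LanguageCacheExample {
        static String[] loadLanguages(IndexReader reader) throws Exception {
            // Loaded once and cached per IndexReader; languages[docId] gives the value.
            return FieldCache.DEFAULT.getStrings(reader, "language");
        }
    }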

Re: Ignoring XML tags when Indexing

2008-07-25 Thread Marcelo Schneider
Daniel Noll wrote: What makes more sense (at least the way I see it) is to implement a Reader which returns the text you need from the XML. This sort of thing is relatively simple to do with the newer StAX API. You can have your reader return even small chunks of text, and it should perform
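
A rough sketch of the idea, assuming the goal is simply to hand the analyzer only the character data with all tags dropped; a production version would stream chunks rather than build one string:

    import java.io.Reader;
    import java.io.StringReader;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;

    public class XmlTextExtractor {
        // Returns a Reader over the text content of the XML, with all tags dropped.
        static Reader textOnly(Reader xml) throws Exception {
            XMLStreamReader parser = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            StringBuilder text = new StringBuilder();
            while (parser.hasNext()) {
                if (parser.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(parser.getText()).append(' ');
                }
            }
            parser.close();
            return new StringReader(text.toString());
        }
    }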

Deleted documents in the index.

2008-07-25 Thread ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)
Hi, I am writing a class to report on an index. This index has documents updated using the IndexWriter.updateDocument(Term, Document) method. That is, documents were deleted and added again. My aim is to see what documents (and their fields) are present in the index. Since the document was updated

Re: Lucene write locks

2008-07-25 Thread Sandeep K
Hi Mike, sorry, I didn't get much of it. I tried with a synchronized block also, but it was a failure. Will it be enough to have a synchronized block only? If each instance of IndexWriter is thread safe, then do we need to check whether it's open or closed each time? Why is it not doing the work eve

Re: Lucene write locks

2008-07-25 Thread Michael McCandless
I think the simplest approach is to use a synchronized block in your application to check if an IndexWriter is currently open, and if so, use that one to add your docs, else, open a new one and store it in a central place where the next message to come in can go and find it? Each instance

Re: Range Query Question

2008-07-25 Thread daniel rosher
Hi Thomas, I think one solution would be similar to the autocomplete function I've implemented in Solr; you can use this as follows in Solr: FieldType: This can then match on the whole string OR part of the string. To use the QueryParser, you'd not be using the query part of the ana

Re: Range Query Question

2008-07-25 Thread Thomas Becker
Btw. I tried the wildcard since I found something on Google which mentioned wildcards together with StartsWith queries. Thomas Becker wrote: Hi Ian, no, the wild cards should not be necessary. That was just the last try out of some. I know the exact content of both fields in my range query. The c

Re: Range Query Question

2008-07-25 Thread Thomas Becker
Hi Ian, no, the wild cards should not be necessary. That was just the last try out of some. I know the exact content of both fields in my range query. The case is as the Java code found it, but the analyzer will lowercase it anyhow. I'm trying the SimpleAnalyzer since all others seem to omit si

Re: Range Query Question

2008-07-25 Thread Ian Lea
Hi Are you sure your range queries should have wild card asterisks on the end? Looks odd to me and I don't know what the effect would be. I'd also prefer everything in lower case but maybe you've got the right analyzers being used consistently in indexing and searching chains. -- Ian. On F

Range Query Question

2008-07-25 Thread Thomas Becker
Hi all, I need to replace some DB queries with Lucene due to response time issues. In this special case I need to do a range query on one field and a prefix query on another. I'm trying to prepare and test my query in Luke, with no success, before migrating it to Java. I need to find all names star
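
A hedged sketch of combining the two clauses programmatically, which sidesteps QueryParser and Luke analysis issues entirely: a RangeQuery on one field AND a PrefixQuery on the other inside a BooleanQuery. Field names and values are invented for illustration; terms must match what the analyzer produced at index time (e.g. lowercased).

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.PrefixQuery;
    import org.apache.lucene.search.RangeQuery;

    public class RangePlusPrefix {
        static BooleanQuery build() {
            BooleanQuery query = new BooleanQuery();
            // Inclusive range on a lexicographically ordered (e.g. zero-padded) field.
            query.add(new RangeQuery(new Term("modified", "20080101"),
                                     new Term("modified", "20081231"), true),
                      BooleanClause.Occur.MUST);
            // All names starting with the given prefix, lowercased to match the index.
            query.add(new PrefixQuery(new Term("name", "tho")), BooleanClause.Occur.MUST);
            return query;
        }
    }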

Re: Lucene write locks

2008-07-25 Thread Sandeep K
Hey Michael, Thanks a lot for your time. On the JMS side I have the code which uses IndexWriter. I have to index the files that are getting uploaded. It's fine all the way if I upload one file. But when I tried with 2-3 files, it gives me the following error. org.apache.lucene.store.LockObtainFa