Re: How to include some more fields to be indexed in the file document class?

2009-12-04 Thread DHIVYA M
Thanks for the suggestion, sir. But I copied the Document method of the FileDocument class into my indexing program so that it will use this method rather than referring to the one from the jar file. Updating the jar by recompiling the class seemed time-consuming to me, so I did it this way. Thanks, Dhivya

Re: searchWithFilter bug?

2009-12-04 Thread Simon Willnauer
On Fri, Dec 4, 2009 at 7:09 PM, Michael McCandless wrote: > On Fri, Dec 4, 2009 at 12:53 PM, Simon Willnauer > wrote: > >> @Mike: maybe we should add a testcase / method in TestFilteredSearch >> that searches on more than one segment. > Working on it... will open an issue in a bit. > I agree, we

Re: searchWithFilter bug?

2009-12-04 Thread Michael McCandless
On Fri, Dec 4, 2009 at 12:53 PM, Simon Willnauer wrote: > @Mike: maybe we should add a testcase / method in TestFilteredSearch > that searches on more than one segment. I agree, we should -- wanna cough up a patch? Mike - To u

Re: searchWithFilter bug?

2009-12-04 Thread Simon Willnauer
-- Forwarded message -- From: Simon Willnauer Date: Fri, Dec 4, 2009 at 6:53 PM Subject: Re: searchWithFilter bug? To: Peter Keegan Peter, since search is per segment you need to use the segment reader passed in during search to create your DocIdSet; if you use absolute docIDs your
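Simon's point — that a filter built from absolute, top-level docIDs breaks once search runs per segment — can be illustrated without Lucene at all. Below is a minimal sketch in plain Java; the `docBase` and `maxDoc` values are hypothetical. Each segment sees docIDs starting at 0, so a top-level BitSet must be re-based per segment before its bits line up with segment-local docIDs:

```java
import java.util.BitSet;

public class DocBaseSketch {
    // Re-base a top-level BitSet onto one segment's local docID space.
    // docBase is the segment's offset into the top-level docID space,
    // maxDoc its document count (both hypothetical values here).
    static BitSet perSegment(BitSet topLevel, int docBase, int maxDoc) {
        BitSet segment = new BitSet(maxDoc);
        for (int doc = topLevel.nextSetBit(docBase);
             doc >= 0 && doc < docBase + maxDoc;
             doc = topLevel.nextSetBit(doc + 1)) {
            segment.set(doc - docBase); // absolute docID -> segment-local docID
        }
        return segment;
    }

    public static void main(String[] args) {
        BitSet top = new BitSet();
        top.set(3);
        top.set(10);
        top.set(12);
        // Pretend one segment covers top-level docs [10, 20): docBase=10, maxDoc=10.
        // Without re-basing, bits 10 and 12 would be compared against
        // segment-local docIDs 10 and 12 -- the wrong documents, or none at all.
        System.out.println(perSegment(top, 10, 10)); // prints {0, 2}
    }
}
```

This is exactly why a filter that matched at the top level can return only a subset of hits once `IndexSearcher` iterates segment by segment.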

Re: searchWithFilter bug?

2009-12-04 Thread Peter Keegan
The filter is just a java.util.BitSet. I use the top level reader to create the filter, and call IndexSearcher.search (Query, Filter, HitCollector). So, there is no 'docBase' at this level of the api. Peter On Fri, Dec 4, 2009 at 11:01 AM, Simon Willnauer < simon.willna...@googlemail.com> wrote:

Re: searchWithFilter bug?

2009-12-04 Thread Simon Willnauer
Peter, which filter do you use, and do you respect the IndexReader's maxDoc() and the docBase? simon On Fri, Dec 4, 2009 at 4:47 PM, Peter Keegan wrote: > I think the Filter's docIdSetIterator is using the top level reader for each > segment, because the cardinality of the DocIdSet from which it's cr

Re: searchWithFilter bug?

2009-12-04 Thread Peter Keegan
I think the Filter's docIdSetIterator is using the top level reader for each segment, because the cardinality of the DocIdSet from which it's created is the same for all readers (and what I expect to see at the top level). Peter On Fri, Dec 4, 2009 at 10:38 AM, Michael McCandless < luc...@mikemcca

Re: searchWithFilter bug?

2009-12-04 Thread Michael McCandless
That doesn't sound good. Though, in searchWithFilter, we seem to ask for the Query's scorer, and the Filter's docIdSetIterator, using the same reader (which may be toplevel, for the legacy case, or per-segment, for the normal case). So I'm not [yet] seeing where the issue is... Can you boil it do

searchWithFilter bug?

2009-12-04 Thread Peter Keegan
I'm having a problem with 'searchWithFilter' on Lucene 2.9.1. The Filter wraps a simple BitSet. When doing a 'MatchAllDocs' query with this filter, I get only a subset of the expected results, even accounting for deletes. The index has 10 segments. In IndexSearcher->searchWithFilter, it looks like

Re: IndexDivisor

2009-12-04 Thread Ganesh
I didn't run it with a profiler. I created a test app and ran that. I am opening multiple databases. IndexReader opened with IndexDivisor: 100 //Open the reader with the divisor value TermCount: 7046764 //Available unique terms in the db Warmup done:
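As context for the numbers in this thread: the index divisor makes the reader keep only every Nth entry of the on-disk term index in RAM, trading memory for term-lookup speed. A back-of-the-envelope sketch — the term count is the one reported above, while termIndexInterval 128 is Lucene's write-time default and an assumption here:

```java
public class DivisorEstimate {
    public static void main(String[] args) {
        long uniqueTerms = 7046764L;  // TermCount reported in the run above
        int termIndexInterval = 128;  // Lucene's default at write time (assumed)
        int indexDivisor = 100;       // value the reader was opened with

        // With divisor d, the reader loads every (interval * d)-th term,
        // so the in-memory term index shrinks by a factor of d.
        long entriesDivisor1 = uniqueTerms / termIndexInterval;
        long entriesDivisor100 = entriesDivisor1 / indexDivisor;
        System.out.println(entriesDivisor1 + " -> " + entriesDivisor100); // prints 55052 -> 550
    }
}
```

The cost of the larger divisor is paid at query time: each term lookup has to scan up to (interval * divisor) terms on disk after the nearest in-memory entry.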

Re: How to do relevancy ranking in lucene

2009-12-04 Thread Erick Erickson
Hmmm, I don't know the underlying scoring code well enough to answer off the top of my head. But if you have the source code, I'd examine the junit tests (the class names should give you a strong hint) and start from there. Best Erick On Fri, Dec 4, 2009 at 12:15 AM, DHIVYA M wrote: > yes ofcour

Re: Norm Value of not existing Field

2009-12-04 Thread Erick Erickson
The word "Filter" as part of a class name is overloaded in Lucene. See: http://lucene.apache.org/java/2_9_1/api/all/index.html The above filter is just a DocIdSet, one bit per document. So in your example, you're only talking 12M or so, even if you create one filter for every field and keep it aro

Re: IndexDivisor

2009-12-04 Thread Michael McCandless
I'm confused -- what are these attachments? Output from a memory profiler? Can you post the app you created? Mike On Fri, Dec 4, 2009 at 12:24 AM, Ganesh wrote: > Thanks mike.. > > Please find the attached file. I ran the testing for 1,100,1000,1 divisor > value.  There is difference from

Re: updating index

2009-12-04 Thread Ian Lea
writer.updateDocument(new Term("id", ""+i), doc); Read the javadocs! Haven't we been here before? -- Ian. On Fri, Dec 4, 2009 at 10:30 AM, m.harig wrote: > > hello all > >        how do i update my existing index to avoid my duplicates , this is > how am doing my indexing > >   doc.add(new F
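A slightly fuller sketch of the call Ian points to, assuming Lucene 2.x on the classpath and reusing m.harig's field names: `IndexWriter.updateDocument` first deletes every document containing the given term, then adds the new document, so re-indexing the same id never leaves a duplicate behind.

```java
import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class UpdateSketch {
    // Adds the document on first sight of this id, replaces it afterwards.
    // The "id" field must be indexed NOT_ANALYZED so the Term matches exactly.
    static void indexOrReplace(IndexWriter writer, String id, String title)
            throws IOException {
        Document doc = new Document();
        doc.add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("title", title, Field.Store.YES, Field.Index.ANALYZED));
        writer.updateDocument(new Term("id", id), doc); // delete-then-add per id
    }
}
```

Calling `indexOrReplace` twice with the same id leaves exactly one matching document in the index, which is the deduplication m.harig is after.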

updating index

2009-12-04 Thread m.harig
Hello all, how do I update my existing index to avoid duplicates? This is how I am doing my indexing: doc.add(new Field("id",""+i,Field.Store.YES,Field.Index.NOT_ANALYZED)); doc.add(new Field("title", indexForm.getTitle(), Field.Store.YES,

Re: How to include some more fields to be indexed in the file document class?

2009-12-04 Thread Anshum
Hi Dhivya, So are you using the same demo code for your app? In case you are, you have to modify that code and continue. All said and done, you'd have to add fields in your java file and recompile (in case you are already using some code for that purpose). In case you would be starting to write an ind

How to include some more fields to be indexed in the file document class?

2009-12-04 Thread DHIVYA M
Hi all, I am using Lucene 2.3.2. I would like to include some more fields to be indexed other than the available ones. In the FileDocument class of the demo version of Lucene 2.3.2 there are only three fields added to the documents to be indexed. Ex: doc.add(new Field("path"..
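For concreteness, this is the kind of change being asked about, sketched against the demo's FileDocument (the 2.3.2 demo adds "path", "modified", and "contents"). The field names below are hypothetical additions, and since the fragment needs the Lucene 2.3.2 jar it is shown as a sketch rather than a runnable program; note that 2.3.x uses `Field.Index.UN_TOKENIZED` rather than the later `NOT_ANALYZED`:

```java
// Inside FileDocument.Document(File f), after the demo's existing doc.add calls:

// Store the file size as an untokenized, searchable field (hypothetical name).
doc.add(new Field("filesize", Long.toString(f.length()),
                  Field.Store.YES, Field.Index.UN_TOKENIZED));

// Store the bare file name separately from the full path (hypothetical name).
doc.add(new Field("filename", f.getName(),
                  Field.Store.YES, Field.Index.UN_TOKENIZED));
```

After adding fields you recompile the demo source and use your own class, rather than patching the shipped jar.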

Re: Norm Value of not existing Field

2009-12-04 Thread Benjamin Heilbrunn
Erick, I'm not sure if I understand you right. What do you mean by "spinning through all the terms on a field"? It would be an option to load all unique terms of a field by using TermEnum. Then use TermDocs to get the docs for those terms. The rest of the docs don't contain a term and so you know, th
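Benjamin's TermEnum/TermDocs plan can be sketched with plain Java collections standing in for the Lucene APIs (an illustration, not Lucene code): union the postings of every term in the field, then take the complement up to maxDoc to get the docs that lack the field entirely:

```java
import java.util.BitSet;
import java.util.Map;

public class MissingFieldSketch {
    // 'postings' maps each unique term of one field to the docIDs containing it,
    // playing the role of TermEnum (the keys) plus TermDocs (the values).
    static BitSet docsWithoutField(Map<String, int[]> postings, int maxDoc) {
        BitSet hasField = new BitSet(maxDoc);
        for (int[] docs : postings.values())
            for (int doc : docs)
                hasField.set(doc);    // any term in the field marks the doc
        hasField.flip(0, maxDoc);     // complement: docs lacking the field
        return hasField;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = Map.of(
                "apple", new int[] {0, 2},
                "pear",  new int[] {2, 4});
        System.out.println(docsWithoutField(postings, 5)); // prints {1, 3}
    }
}
```

In real Lucene the same walk would also have to skip deleted documents, and the resulting bit set answers the original question: those docs can have no norm value for the field.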