The Hits class collects the document ids from the query in batches. If you iterate beyond what was collected, the query is re-executed to collect more ids.
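To make the batching concrete, here is a minimal sketch in plain Java. BatchedHits is a hypothetical stand-in, not Lucene's Hits class, and the first batch size of 100 and the doubling policy are assumptions for illustration only. It shows why iterating past the cached ids costs another full run of the query, which would hurt most when the query itself is slow, as with a leading wildcard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical stand-in for a batched hit list; NOT Lucene's Hits class. */
class BatchedHits {
    // Re-runs the whole query and returns the ids of the top n matches.
    private final IntFunction<List<Integer>> search;
    private final int total;              // total number of matching documents
    private List<Integer> cached = new ArrayList<>();
    private int searches = 0;             // how many times the query was executed

    BatchedHits(IntFunction<List<Integer>> search, int total, int firstBatch) {
        this.search = search;
        this.total = total;
        fetch(firstBatch);                // assumed initial batch size
    }

    // Execute the query again, this time asking for the top n ids.
    private void fetch(int n) {
        searches++;
        cached = search.apply(Math.min(n, total));
    }

    int length() { return total; }

    int id(int i) {
        if (i < 0 || i >= total) throw new IndexOutOfBoundsException("hit " + i);
        // Past the cached range: re-execute with a larger (assumed doubled) limit.
        if (i >= cached.size()) fetch(Math.max(i + 1, cached.size() * 2));
        return cached.get(i);
    }

    int searchCount() { return searches; }

    /** Iterates over every hit and reports how many query executions that took. */
    static int searchesToIterateAll(int totalDocs, int firstBatch) {
        BatchedHits hits = new BatchedHits(n -> {
            List<Integer> ids = new ArrayList<>(n);
            for (int d = 0; d < n; d++) ids.add(d);   // fake query: ids 0..n-1
            return ids;
        }, totalDocs, firstBatch);
        for (int i = 0; i < hits.length(); i++) hits.id(i);
        return hits.searchCount();
    }

    public static void main(String[] args) {
        // 250 hits, first batch of 100: fetches of 100, 200, then 250 ids.
        System.out.println("searches: " + searchesToIterateAll(250, 100)); // prints "searches: 3"
    }
}
```

In this model, the time measured around the retrieval loop includes every re-execution of the query, so an expensive query shows up as slow document retrieval even though fetching each document costs the same.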
You can use the expert level search methods on IndexSearcher if this isn't what you want.

-Yonik

On 9/8/05, Richard Krenek <[EMAIL PROTECTED]> wrote:
> I understand that for the query, but why does it matter once you have the
> Hits object? That is the part I'm baffled on. The query with the wildcard
> in the front takes a lot longer, but we expected that.
>
> On 9/8/05, Jeremy Meyer <[EMAIL PROTECTED]> wrote:
> > The issue isn't with multiple wildcards exactly. Specifically, the
> > problem is if the query starts with a wildcard. In the case where it
> > starts with a wildcard, Lucene has no option but to linearly go over
> > every term in the index to see if it matches your pattern. It must visit
> > every single term in the index. If it doesn't start with a wildcard,
> > Lucene can skip to the relevant part of the index and only visit the
> > relevant terms. For this reason, many people that use Lucene choose to
> > disable having a wildcard at the start of a search term. This is
> > discussed in the "Lucene in Action" book.
> >
> > ~Jack~
> >
> > >> Hello All,
> > >> I am getting some weird time results when retrieving documents back
> > >> from a hits object. I am just timing this bit of code:
> > >>
> > >> Hits hits = searcher.search(query);
> > >>
> > >> long startTime = System.currentTimeMillis();
> > >> for (int i = 0; i < hits.length(); i++) {
> > >>     Document doc = hits.doc(i);
> > >>     String field = doc.get(defaultField);
> > >> }
> > >> System.out.println("Cycle Time: " + (System.currentTimeMillis() - startTime));
> > >>
> > >> It seems when I have a wildcard query like *abcd* vs weqrew*, the
> > >> *abcd* query will always take longer to retrieve the documents even if
> > >> they are of similar result sizes. We are talking a big difference, 1
> > >> second vs 16. It is consistent no matter what order I run the queries
> > >> in: terms with multiple wildcards always take longer to retrieve the
> > >> documents. I am not counting the time of the query.
> > >>
> > >> The index is 2.18 GB, 9 fields per document, 10,694,190 documents,
> > >> 25,538,793 terms and has been optimized.
> > >>
> > >> I am not sure if this is a real or just a perceived issue. We cannot
> > >> figure out why the type of query would affect the speed it takes to
> > >> retrieve each document. We have run this on both Windows XP and Linux,
> > >> with the same results. Also to note, we did watch GC and this did not
> > >> have any significant impact that we could see.
> > >>
> > >> We are trying to figure out what could cause this and how we can work
> > >> around it.
> > >>
> > >> Thanks,
> > >> Richard

--
-Yonik
Now hiring -- http://tinyurl.com/7m67g