4. Search for records with filter.
if the filter returns a lot of ids, it willn' t be fast. Recently I have a test. I customized a filter which get a list of ids from a mysql database table of size 5000. Then I invoke the search(query, filter, hitcollector), I took me more than 40s to retrieve the first 100 hits. Then I try search(query, hitcollector), the fragment code is following:
public MyHitCollector1(ArrayList list , String[] allIDs, ArrayList idList) { this.list = list; this.idList = idList; this.allIDs = allIDs; } public override void Collect(int doc, float score) { if (score > 0.0f) { if (count < 1000) { if (idList.BinarySearch(allIDs[doc]) >= 0) { list.Add(IndexSearcher.GetDocument(doc)); count++; } } else return; } } you can get String[] allIDs = FieldCache.DEFAULT.GetStrings(IndexReader(), "ID") in this way. This method took about 8s to retrieve the first 100/1000 hits. Here the database table has about 300,000 records 2006/8/11, Aleksei Valikov <[EMAIL PROTECTED]>:
Hi. I'm investigating a possibility to make a "join" in Lucene/Compass. Here's the thread: http://forums.opensymphony.com/thread.jspa?threadID=39685&tstart=0 I have records m:m entities. Entities hold indexed information. Records consist of entities. One entity may belong to many records. I would like to search for records having certain entity information. Entity documents contain indexable entity fields plus entity id. Record documents contain indexable record fielfs, record id and entity ids. I'd like to "search for records having entity ids in (search for entity ids where entity fields satisfy condition)". Currently I am using self-written InSetFilter to accomplish the task. 1. Search among entities by the given condition. 2. Put ids of the found entities into a set. 3. Create filter with this set. 4. Search for records with filter. The join is basically implemented by a filter: public BitSet bits(IndexReader reader) throws IOException { BitSet bits = new BitSet(reader.maxDoc()); int[] docs = new int[1]; int[] freqs = new int[1]; for (Iterator iterator = set.iterator(); iterator.hasNext();) { final String id = (String) iterator.next(); final TermDocs termDocs = reader.termDocs(new Term(fieldName, id)); final int count = termDocs.read(docs, freqs); if (count == 1) { bits.set(docs[0]); } } return bits; } Is this approach ok or I missed something and there's an easier way to join? Thank you for you time. Bye. /lexi --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]