I’m working with Lucene 5.1 to try to make use of the relational structure of the block join index and query mechanisms. I’m querying with the following code:

    IndexReader reader = DirectoryReader.open(index);
    ToParentBlockJoinIndexSearcher searcher = new ToParentBlockJoinIndexSearcher(reader);
    ToParentBlockJoinCollector collector =
        new ToParentBlockJoinCollector(Sort.RELEVANCE, 2, true, true);
    BitDocIdSetFilter codingScheme = new BitDocIdSetCachingWrapperFilter(
        new QueryWrapperFilter(
            new QueryParser("codingSchemeName",
                new StandardAnalyzer(new CharArraySet(0, true)))
                .parse(scheme.getCodingSchemeName())));
    Query query = new QueryParser(null, new StandardAnalyzer(new CharArraySet(0, true)))
        .createBooleanQuery("propertyValue", term.getTerm(), Occur.MUST);
    ToParentBlockJoinQuery termJoinQuery =
        new ToParentBlockJoinQuery(query, codingScheme, ScoreMode.Avg);
    searcher.search(termJoinQuery, collector);

The goal is to get the parent values, but the search fails on the final line with the following stack trace:

    Exception in thread "main" java.lang.IllegalStateException: child query must only match non-parent docs, but parent docID=2147483647 matched childScorer=class org.apache.lucene.search.TermScorer
        at org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinScorer.nextDoc(ToParentBlockJoinQuery.java:330)
        at org.apache.lucene.search.join.ToParentBlockJoinIndexSearcher.search(ToParentBlockJoinIndexSearcher.java:63)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:428)
        at org.lexevs.lucene.prototype.LuceneQueryTrial.luceneToParentJoinQuery(LuceneQueryTrial.java:78)
        at org.lexevs.lucene.prototype.LuceneQueryTrial.main(LuceneQueryTrial.java:327)

I build indexes of up to about 36 GB using code similar to the following:

    List<Document> list = new ArrayList<Document>();
    //need a static
    int staticCount = count;
    ParentDocObject parent = builder.generateParentDoc(cs.getCodingSchemeName(),
        cs.getVersion(), cs.getURI(), "description");
    if (cs.codingSchemeName.equals(CodingScheme.THESSCHEME.codingSchemeName)) {
        //One per coding Scheme
        int numberOfProperties = 12;
        if (!thesExactMatchDone) {
            ChildDocObject child1 =
                builder.generateChildDocWithSalt(parent, SearchTerms.BLOOD.getTerm());
            Document doc1 = builder.mapToDocumentExactMatch(child1);
            list.add(doc1);
            count++;
            numberOfProperties--;
            ChildDocObject child =
                builder.generateChildDocWithSalt(parent, SearchTerms.CHAR.term);
            Document doc = builder.mapToDocumentExactMatch(child);
            count++;
            list.add(doc);
            numberOfProperties--;
            thesExactMatchDone = true;
        }
        while (numberOfProperties > 0) {
            if (count % 547 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGenerator(
                        builder.randomNumberGenerator(), SearchTerms.BLOOD.getTerm()));
                Document doc = builder.mapToDocument(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else if (count % 233 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGenerator(
                        builder.randomNumberGenerator(), SearchTerms.CHAR.getTerm()));
                Document doc = builder.mapToDocument(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else if (count % 71 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGenerator(
                        builder.randomNumberGenerator(), SearchTerms.ARTICLE.getTerm()));
                Document doc = builder.mapToDocument(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else if (count % 2237 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGenerator(
                        builder.randomNumberGenerator(), SearchTerms.LUNG_CANCER.getTerm()));
                Document doc = builder.mapToDocument(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else if (count % 5077 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGenerator(
                        builder.randomNumberGenerator(), SearchTerms.LIVER_CARCINOMA.getTerm()));
                Document doc = builder.mapToDocument(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else if (count % 2371 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGeneratorStartsWith(
                        builder.randomNumberGenerator(), SearchTerms.BLOOD.getTerm()));
                Document doc = builder.mapToDocumentExactMatch(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else if (count % 79 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGeneratorStartsWith(
                        builder.randomNumberGenerator(), SearchTerms.ARTICLE.getTerm()));
                Document doc = builder.mapToDocumentExactMatch(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else if (count % 3581 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGeneratorStartsWith(
                        builder.randomNumberGenerator(), SearchTerms.LUNG_CANCER.getTerm()));
                Document doc = builder.mapToDocumentExactMatch(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else if (count % 23 == 0) {
                ChildDocObject child = builder.generateChildDocWithSalt(parent,
                    builder.randomTextGeneratorStartsWith(
                        builder.randomNumberGenerator(), SearchTerms.CHAR.getTerm()));
                Document doc = builder.mapToDocumentExactMatch(child);
                list.add(doc);
                count++; numberOfProperties--;
            } else {
                ChildDocObject child = builder.generateChildDoc(parent);
                Document doc = builder.mapToDocument(child);
                list.add(doc);
                count++;
                numberOfProperties--;
            }
        }
    }
    Document par = builder.mapToDocument(parent);
    list.add(par);
    writer.addDocuments(list);
    }

This works pretty well until I scale it up by running several instances of it. When the nextChildDoc retrieved reaches id 5874902, this line in ToParentBlockJoinQuery

    parentDoc = parentBits.nextSetBit(nextChildDoc);

assigns the value 2147483647 to parentDoc. If I understand Lucene and Luke correctly, that is not a document id in my index, since my index has only 42716877 documents.

Can someone shed some light on this exception?

Thanks,
Scott Bauer
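
A note on the value itself: 2147483647 is Integer.MAX_VALUE, which Lucene defines as DocIdSetIterator.NO_MORE_DOCS, and in Lucene 5.x nextSetBit appears to return that sentinel when no set bit follows the given index. The sketch below is only an illustration of that lookup under those assumptions. It uses java.util.BitSet rather than Lucene's own bit set, and the class name, the nextParent helper, and the document ids are all hypothetical.

```java
import java.util.BitSet;

final class ParentLookupSketch {
    // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE).
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Hypothetical helper: find the first parent at or after childDoc.
    // java.util.BitSet returns -1 when nothing is found, so map that to the
    // sentinel to mimic the convention assumed for Lucene 5.x.
    static int nextParent(BitSet parentBits, int childDoc) {
        int next = parentBits.nextSetBit(childDoc);
        return next < 0 ? NO_MORE_DOCS : next;
    }

    public static void main(String[] args) {
        BitSet parents = new BitSet();
        parents.set(3); // block of docs 0..3, parent is doc 3
        parents.set(7); // block of docs 4..7, parent is doc 7
        // Docs 8 and up are children with no parent bit set after them.

        System.out.println(nextParent(parents, 1)); // 3
        System.out.println(nextParent(parents, 5)); // 7
        System.out.println(nextParent(parents, 9)); // 2147483647
    }
}
```

Read this way, a parentDoc of 2147483647 would mean that the parent filter's bit set has no parent at or after child 5874902, rather than that such a document id exists in the index.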