It's clear that the lookup is simply returning Integer.MAX_VALUE for parentDoc. Given the recent change noted in https://issues.apache.org/jira/browse/LUCENE-6021, where FixedBitSet now returns Integer.MAX_VALUE instead of -1, I wonder whether a bug was introduced into the BlockJoinScorer.nextDoc method. Unfortunately, I have not yet come up with an example that makes this fail on a smaller test index. The child document in question does have a parent, doc #4823684, so I'm confused as to how the NO_MORE_DOCS value could be returned for it. Is there something obvious I'm missing here?
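To make the suspicion concrete, here is a minimal, Lucene-free sketch of the lookup in question. It is an illustration under stated assumptions, not the actual BlockJoinScorer code: java.util.BitSet stands in for FixedBitSet, and the class name ParentLookupSketch and the method nextParent are hypothetical. It only shows that if the parent bit set has no set bit at or after a child's doc id, the lookup yields Integer.MAX_VALUE, which is the value seen in the exception. Whether that is what happens in my index is still unconfirmed.

```java
import java.util.BitSet;

// Hypothetical stand-in for the parent lookup; not Lucene code.
class ParentLookupSketch {
    // Same numeric value as Lucene's NO_MORE_DOCS sentinel (Integer.MAX_VALUE).
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Mimics a nextSetBit that reports "no set bit at or after childDoc"
    // with NO_MORE_DOCS instead of -1 (java.util.BitSet itself returns -1).
    static int nextParent(BitSet parentBits, int childDoc) {
        int next = parentBits.nextSetBit(childDoc);
        return next == -1 ? NO_MORE_DOCS : next;
    }

    public static void main(String[] args) {
        BitSet parentBits = new BitSet();
        parentBits.set(3); // parent that closes the block of children 0..2
        parentBits.set(7); // parent that closes the block of children 4..6
        // Assumption for illustration: docs 8 and 9 are children whose
        // parent is NOT marked in the parent bit set.

        System.out.println(nextParent(parentBits, 5)); // child 5 resolves to parent 7
        System.out.println(nextParent(parentBits, 8)); // no parent bit found: sentinel value
    }
}
```

If something like this is the cause, the question becomes why the parent filter has no bit set for the parent of that particular block.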
On 6/5/15, 12:05 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu> wrote:

>One correction, it looks like the parentBits call has 4823680 passed to it
>to generate the erroneous docId.
>
>On 6/5/15, 10:34 AM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
>wrote:
>
>>I should mention that this worked in 4.10.4 using a very similar code
>>base. -scott
>>
>>On 6/4/15, 4:51 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
>>wrote:
>>
>>>I'm working with Lucene 5.1 to try to make use of the relational
>>>structure of the block join index and query mechanisms. I'm querying
>>>with the following code:
>>>
>>>IndexReader reader = DirectoryReader.open(index);
>>>ToParentBlockJoinIndexSearcher searcher = new ToParentBlockJoinIndexSearcher(reader);
>>>ToParentBlockJoinCollector collector = new ToParentBlockJoinCollector(Sort.RELEVANCE, 2, true, true);
>>>BitDocIdSetFilter codingScheme = new BitDocIdSetCachingWrapperFilter(
>>>    new QueryWrapperFilter(new QueryParser("codingSchemeName",
>>>        new StandardAnalyzer(new CharArraySet(0, true))).parse(scheme.getCodingSchemeName())));
>>>Query query = new QueryParser(null, new StandardAnalyzer(new CharArraySet(0, true)))
>>>    .createBooleanQuery("propertyValue", term.getTerm(), Occur.MUST);
>>>ToParentBlockJoinQuery termJoinQuery = new ToParentBlockJoinQuery(
>>>    query,
>>>    codingScheme,
>>>    ScoreMode.Avg);
>>>searcher.search(termJoinQuery, collector);
>>>
>>>To try to get parent values, but it fails on the final line with the
>>>following stack trace:
>>>
>>>Exception in thread "main" java.lang.IllegalStateException: child query must only match non-parent docs, but parent docID=2147483647 matched childScorer=class org.apache.lucene.search.TermScorer
>>>    at org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinScorer.nextDoc(ToParentBlockJoinQuery.java:330)
>>>    at org.apache.lucene.search.join.ToParentBlockJoinIndexSearcher.search(ToParentBlockJoinIndexSearcher.java:63)
>>>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:428)
>>>    at org.lexevs.lucene.prototype.LuceneQueryTrial.luceneToParentJoinQuery(LuceneQueryTrial.java:78)
>>>    at org.lexevs.lucene.prototype.LuceneQueryTrial.main(LuceneQueryTrial.java:327)
>>>
>>>I build indexes up to about 36Gb using a code similar to the following:
>>>
>>>List<Document> list = new ArrayList<Document>();
>>>//need a static
>>>int staticCount = count;
>>>ParentDocObject parent = builder.generateParentDoc(cs.getCodingSchemeName(),
>>>    cs.getVersion(), cs.getURI(), "description");
>>>if (cs.codingSchemeName.equals(CodingScheme.THESSCHEME.codingSchemeName)) {
>>>    //One per coding Scheme
>>>    int numberOfProperties = 12;
>>>    if (!thesExactMatchDone) {
>>>        ChildDocObject child1 = builder.generateChildDocWithSalt(parent, SearchTerms.BLOOD.getTerm());
>>>        Document doc1 = builder.mapToDocumentExactMatch(child1);
>>>        list.add(doc1);
>>>        count++;
>>>        numberOfProperties--;
>>>        ChildDocObject child = builder.generateChildDocWithSalt(parent, SearchTerms.CHAR.term);
>>>        Document doc = builder.mapToDocumentExactMatch(child);
>>>        count++;
>>>        list.add(doc);
>>>        numberOfProperties--;
>>>        thesExactMatchDone = true;
>>>    }
>>>    while (numberOfProperties > 0) {
>>>        if (count % 547 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGenerator(
>>>                    builder.randomNumberGenerator(), SearchTerms.BLOOD.getTerm()));
>>>            Document doc = builder.mapToDocument(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else if (count % 233 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGenerator(
>>>                    builder.randomNumberGenerator(), SearchTerms.CHAR.getTerm()));
>>>            Document doc = builder.mapToDocument(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else if (count % 71 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGenerator(
>>>                    builder.randomNumberGenerator(), SearchTerms.ARTICLE.getTerm()));
>>>            Document doc = builder.mapToDocument(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else if (count % 2237 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGenerator(
>>>                    builder.randomNumberGenerator(), SearchTerms.LUNG_CANCER.getTerm()));
>>>            Document doc = builder.mapToDocument(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else if (count % 5077 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGenerator(
>>>                    builder.randomNumberGenerator(), SearchTerms.LIVER_CARCINOMA.getTerm()));
>>>            Document doc = builder.mapToDocument(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else if (count % 2371 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGeneratorStartsWith(
>>>                    builder.randomNumberGenerator(), SearchTerms.BLOOD.getTerm()));
>>>            Document doc = builder.mapToDocumentExactMatch(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else if (count % 79 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGeneratorStartsWith(
>>>                    builder.randomNumberGenerator(), SearchTerms.ARTICLE.getTerm()));
>>>            Document doc = builder.mapToDocumentExactMatch(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else if (count % 3581 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGeneratorStartsWith(
>>>                    builder.randomNumberGenerator(), SearchTerms.LUNG_CANCER.getTerm()));
>>>            Document doc = builder.mapToDocumentExactMatch(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else if (count % 23 == 0) {
>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>                builder.randomTextGeneratorStartsWith(
>>>                    builder.randomNumberGenerator(), SearchTerms.CHAR.getTerm()));
>>>            Document doc = builder.mapToDocumentExactMatch(child);
>>>            list.add(doc);
>>>            count++; numberOfProperties--;
>>>        } else {
>>>            ChildDocObject child = builder.generateChildDoc(parent);
>>>            Document doc = builder.mapToDocument(child);
>>>            list.add(doc);
>>>            count++;
>>>            numberOfProperties--;
>>>        }
>>>    }
>>>}
>>>Document par = builder.mapToDocument(parent);
>>>list.add(par);
>>>writer.addDocuments(list);
>>>}
>>>
>>>Which works pretty well until I scale it up using several instances of
>>>this. When the nextChildDoc document retrieved gets to id 5874902 the
>>>line in ToParentBlockJoinQuery
>>>
>>>    parentDoc = parentBits.nextSetBit(nextChildDoc);
>>>
>>>Gives the value 2147483647 to the parentDoc, which is not a document id
>>>in my index if I understand lucene and Luke correctly since my index has
>>>only 42716877 documents.
>>>
>>>Can someone shed some light on this exception?
>>>
>>>Thanks,
>>>
>>>Scott Bauer