I suspect this issue is related to the Solr issue described here:
https://issues.apache.org/jira/browse/SOLR-7606

I can find at least one group of documents with a missing parent in my generated index, although that doesn’t explain why I didn’t see a similar failure in 4.10.4. The BitSet implementation itself doesn’t appear to be at fault, but the filtered bit set inside it may be what causes the problem when a parent is missing.

I have to say I’m a little concerned about the lack of feedback on this list. Is there another forum that is more active on this subject, or is the block join implementation just not used or supported that much?
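For anyone who wants to check an index for the same condition, here is a minimal sketch of the idea. It assumes the behaviour described in the quoted message below (a next-set-bit lookup that yields Integer.MAX_VALUE when no parent follows a child) and uses java.util.BitSet as a stand-in for the parent bit set; it is not Lucene code. The class OrphanChildCheck and the methods nextParent and firstOrphan are hypothetical names made up for this illustration, and doc ids are treated as one flat range, which is a simplification.

```java
import java.util.BitSet;

/**
 * Hypothetical helper, not part of Lucene. java.util.BitSet stands in for
 * the parent bit set so the sketch runs without any Lucene dependency.
 */
class OrphanChildCheck {

    /** Sentinel with the same value as the one reported in the exception (2147483647). */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /**
     * Looks up the parent of a child doc: the first set bit at or after childDoc.
     * java.util.BitSet returns -1 when there is none; this maps that case to
     * NO_MORE_DOCS to mimic the convention described in the thread.
     */
    public static int nextParent(BitSet parentBits, int childDoc) {
        int next = parentBits.nextSetBit(childDoc);
        return next < 0 ? NO_MORE_DOCS : next;
    }

    /**
     * Returns the first doc id in [0, maxDoc) that has no parent at or after it,
     * or -1 if every doc is covered by a following parent.
     */
    public static int firstOrphan(BitSet parentBits, int maxDoc) {
        if (maxDoc <= 0) {
            return -1;
        }
        // Everything after the last parent bit is a child without a parent.
        int lastParent = parentBits.previousSetBit(maxDoc - 1);
        return lastParent == maxDoc - 1 ? -1 : lastParent + 1;
    }

    public static void main(String[] args) {
        // Assumed toy layout: parents at doc 3 and doc 7, ten docs in total,
        // so docs 8 and 9 are children that were never followed by a parent.
        BitSet parentBits = new BitSet();
        parentBits.set(3);
        parentBits.set(7);

        System.out.println("parent of doc 4: " + nextParent(parentBits, 4));
        System.out.println("parent of doc 8: " + nextParent(parentBits, 8));
        System.out.println("first orphan:    " + firstOrphan(parentBits, 10));
    }
}
```

Note that a parent missing in the middle of the range would not show up this way: those children would silently resolve to the next block's parent. Only children that come after the last parent bit produce the sentinel value.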
On 6/22/15, 2:21 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu> wrote:

>Well it’s clear that this is just giving a return value of
>Integer.MAX_VALUE for the parentDoc. Given the recent changes noted here:
>https://issues.apache.org/jira/browse/LUCENE-6021 where FixedBitSet now
>returns Integer.MAX_VALUE instead of -1 I wonder if a bug wasn’t
>introduced to the BlockJoinScorer.nextDoc method. Unfortunately I have
>yet to come up with an example to make this fail on a smaller test index.
>The child document in question does have a parent, which is doc #4823684,
>so I’m confused as to how the NO_MORE_DOCS value would be applied. Is
>there something obvious I’m missing here?
>
>On 6/5/15, 12:05 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu> wrote:
>
>>One correction, it looks like the parentBits call has 4823680 passed to
>>it to generate the erroneous docId.
>>
>>On 6/5/15, 10:34 AM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu> wrote:
>>
>>>I should mention that this worked in 4.10.4 using a very similar code
>>>base. -scott
>>>
>>>On 6/4/15, 4:51 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu> wrote:
>>>
>>>>I’m working with Lucene 5.1 to try to make use of the relational
>>>>structure of the block join index and query mechanisms. I’m querying
>>>>with the following code:
>>>>
>>>>IndexReader reader = DirectoryReader.open(index);
>>>>ToParentBlockJoinIndexSearcher searcher = new ToParentBlockJoinIndexSearcher(reader);
>>>>ToParentBlockJoinCollector collector = new ToParentBlockJoinCollector(Sort.RELEVANCE, 2, true, true);
>>>>BitDocIdSetFilter codingScheme = new BitDocIdSetCachingWrapperFilter(
>>>>    new QueryWrapperFilter(new QueryParser("codingSchemeName",
>>>>        new StandardAnalyzer(new CharArraySet(0, true))).parse(scheme.getCodingSchemeName())));
>>>>Query query = new QueryParser(null, new StandardAnalyzer(new CharArraySet(0, true)))
>>>>    .createBooleanQuery("propertyValue", term.getTerm(), Occur.MUST);
>>>>ToParentBlockJoinQuery termJoinQuery = new ToParentBlockJoinQuery(
>>>>    query,
>>>>    codingScheme,
>>>>    ScoreMode.Avg);
>>>>searcher.search(termJoinQuery, collector);
>>>>
>>>>To try to get parent values, but it fails on the final line with the
>>>>following stack trace:
>>>>
>>>>Exception in thread "main" java.lang.IllegalStateException: child query
>>>>must only match non-parent docs, but parent docID=2147483647 matched
>>>>childScorer=class org.apache.lucene.search.TermScorer
>>>>    at org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinScorer.nextDoc(ToParentBlockJoinQuery.java:330)
>>>>    at org.apache.lucene.search.join.ToParentBlockJoinIndexSearcher.search(ToParentBlockJoinIndexSearcher.java:63)
>>>>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:428)
>>>>    at org.lexevs.lucene.prototype.LuceneQueryTrial.luceneToParentJoinQuery(LuceneQueryTrial.java:78)
>>>>    at org.lexevs.lucene.prototype.LuceneQueryTrial.main(LuceneQueryTrial.java:327)
>>>>
>>>>I build indexes up to about 36Gb using a code similar to the following:
>>>>
>>>>List<Document> list = new ArrayList<Document>();
>>>>//need a static
>>>>int staticCount = count;
>>>>ParentDocObject parent = builder.generateParentDoc(cs.getCodingSchemeName(),
>>>>    cs.getVersion(), cs.getURI(), "description");
>>>>if (cs.codingSchemeName.equals(CodingScheme.THESSCHEME.codingSchemeName)) {
>>>>    //One per coding Scheme
>>>>    int numberOfProperties = 12;
>>>>    if(!thesExactMatchDone){
>>>>        ChildDocObject child1 = builder.generateChildDocWithSalt(parent,SearchTerms.BLOOD.getTerm());
>>>>        Document doc1 = builder.mapToDocumentExactMatch(child1);
>>>>        list.add(doc1);
>>>>        count++;
>>>>        numberOfProperties--;
>>>>        ChildDocObject child = builder.generateChildDocWithSalt(parent,SearchTerms.CHAR.term);
>>>>        Document doc = builder.mapToDocumentExactMatch(child);
>>>>        count++;
>>>>        list.add(doc);
>>>>        numberOfProperties--;
>>>>        thesExactMatchDone = true;
>>>>    }
>>>>    while (numberOfProperties > 0) {
>>>>        if(count % 547 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGenerator(
>>>>                    builder.randomNumberGenerator(),SearchTerms.BLOOD.getTerm()));
>>>>            Document doc = builder.mapToDocument(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        }else if(count % 233 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGenerator(
>>>>                    builder.randomNumberGenerator(),SearchTerms.CHAR.getTerm()));
>>>>            Document doc = builder.mapToDocument(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        }else if(count % 71 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGenerator(
>>>>                    builder.randomNumberGenerator(),SearchTerms.ARTICLE.getTerm()));
>>>>            Document doc = builder.mapToDocument(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        }else if(count % 2237 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGenerator(
>>>>                    builder.randomNumberGenerator(),SearchTerms.LUNG_CANCER.getTerm()));
>>>>            Document doc = builder.mapToDocument(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        }else if(count % 5077 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGenerator(
>>>>                    builder.randomNumberGenerator(),SearchTerms.LIVER_CARCINOMA.getTerm()));
>>>>            Document doc = builder.mapToDocument(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        }else if(count % 2371 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGeneratorStartsWith(
>>>>                    builder.randomNumberGenerator(),SearchTerms.BLOOD.getTerm()));
>>>>            Document doc = builder.mapToDocumentExactMatch(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        }else if(count % 79 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGeneratorStartsWith(
>>>>                    builder.randomNumberGenerator(),SearchTerms.ARTICLE.getTerm()));
>>>>            Document doc = builder.mapToDocumentExactMatch(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        }else if(count % 3581 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGeneratorStartsWith(
>>>>                    builder.randomNumberGenerator(),SearchTerms.LUNG_CANCER.getTerm()));
>>>>            Document doc = builder.mapToDocumentExactMatch(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        }else if(count % 23 == 0){
>>>>            ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>                builder.randomTextGeneratorStartsWith(
>>>>                    builder.randomNumberGenerator(),SearchTerms.CHAR.getTerm()));
>>>>            Document doc = builder.mapToDocumentExactMatch(child);
>>>>            list.add(doc);
>>>>            count++;numberOfProperties--;
>>>>        } else {
>>>>            ChildDocObject child = builder.generateChildDoc(parent);
>>>>            Document doc = builder.mapToDocument(child);
>>>>            list.add(doc);
>>>>            count++;
>>>>            numberOfProperties--;
>>>>        }
>>>>    }
>>>>}
>>>>Document par = builder.mapToDocument(parent);
>>>>list.add(par);
>>>>writer.addDocuments(list);
>>>>}
>>>>
>>>>Which works pretty well until I scale it up using several instances of
>>>>this. When the nextChildDoc document retrieved gets to id 5874902 the
>>>>line in ToParentBlockJoinQuery
>>>>
>>>>    parentDoc = parentBits.nextSetBit(nextChildDoc);
>>>>
>>>>Gives the value 2147483647 to the parentDoc, which is not a document
>>>>id in my index if I understand lucene and Luke correctly since my index
>>>>has only 42716877 documents.
>>>>
>>>>Can someone shed some light on this exception?
>>>>
>>>>Thanks,
>>>>
>>>>Scott Bauer
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>For additional commands, e-mail: java-user-h...@lucene.apache.org