I’m guessing this problem may be related to the Solr issue described here:
https://issues.apache.org/jira/browse/SOLR-7606.  I can find at least one
group of documents with a missing parent in my generated index.  That
doesn’t explain why I didn’t see a similar issue in 4.10.4, though.  I can
see that the BitSet implementation itself isn’t the issue, but the filtered
bit set inside it may be causing the problem given a missing parent.  I have
to say I’m a little concerned about the lack of feedback on this list.  Is
there another forum that is a little more active on this subject, or is the
block join implementation just not used or supported that much?
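
To make the suspicion concrete, here is a minimal sketch of how a group of children with no parent after it could end up with parentDoc = Integer.MAX_VALUE.  It uses java.util.BitSet as a stand-in for Lucene's FixedBitSet, and nextSetBit and firstOrphan are hypothetical helpers, not Lucene API.  The only Lucene behavior assumed is the one noted in LUCENE-6021: in 5.x the lookup returns NO_MORE_DOCS (Integer.MAX_VALUE) rather than -1 when no further bit is set.  As far as I understand, the parent bits are looked up per segment, so "no parent after it" would mean within one segment.

```java
import java.util.BitSet;

class ParentBitsSketch {
    // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Hypothetical stand-in for the 5.x lookup: returns NO_MORE_DOCS
    // instead of -1 when no bit is set at or after 'from'.
    static int nextSetBit(BitSet parentBits, int from) {
        int next = parentBits.nextSetBit(from);
        return next < 0 ? NO_MORE_DOCS : next;
    }

    // Returns the first docID that has no parent at or after it,
    // or -1 if the last document (maxDoc - 1) is itself a parent.
    static int firstOrphan(BitSet parentBits, int maxDoc) {
        int lastParent = parentBits.previousSetBit(maxDoc - 1);
        return lastParent == maxDoc - 1 ? -1 : lastParent + 1;
    }

    public static void main(String[] args) {
        // Six docs: children 0 and 1 with their parent at 2,
        // then children 3, 4 and 5 whose parent is missing.
        BitSet parentBits = new BitSet();
        parentBits.set(2);
        int maxDoc = 6;

        System.out.println(nextSetBit(parentBits, 0));       // 2
        System.out.println(nextSetBit(parentBits, 3));       // 2147483647
        System.out.println(firstOrphan(parentBits, maxDoc)); // 3
    }
}
```

A scan like firstOrphan only catches a parentless group at the end of the doc range; a parentless group in the middle would simply be attached to the next parent and would not show up this way.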

On 6/22/15, 2:21 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
wrote:

>Well, it’s clear that this is just giving a return value of
>Integer.MAX_VALUE for the parentDoc.  Given the recent changes noted here:
> https://issues.apache.org/jira/browse/LUCENE-6021 where FixedBitSet now
>returns Integer.MAX_VALUE instead of -1, I wonder if a bug was
>introduced in the BlockJoinScorer.nextDoc method.  Unfortunately I have
>yet to come up with an example that makes this fail on a smaller test
>index.  The child document in question does have a parent, which is doc
>#4823684, so I’m confused as to how the NO_MORE_DOCS value would be
>applied.  Is there something obvious I’m missing here?
>
>On 6/5/15, 12:05 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
>wrote:
>
>>One correction: it looks like the parentBits call has 4823680 passed to
>>it to generate the erroneous docId.
>>
>>On 6/5/15, 10:34 AM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
>>wrote:
>>
>>>I should mention that this worked in 4.10.4 using a very similar code
>>>base.  -scott
>>>
>>>On 6/4/15, 4:51 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
>>>wrote:
>>>
>>>>I'm working with Lucene 5.1 to try to make use of the relational
>>>>structure of the block join index and query mechanisms.  I'm querying
>>>>with the following code:
>>>>
>>>>IndexReader reader = DirectoryReader.open(index);
>>>>ToParentBlockJoinIndexSearcher searcher =
>>>>    new ToParentBlockJoinIndexSearcher(reader);
>>>>ToParentBlockJoinCollector collector =
>>>>    new ToParentBlockJoinCollector(Sort.RELEVANCE, 2, true, true);
>>>>
>>>>BitDocIdSetFilter codingScheme = new BitDocIdSetCachingWrapperFilter(
>>>>    new QueryWrapperFilter(
>>>>        new QueryParser("codingSchemeName",
>>>>            new StandardAnalyzer(new CharArraySet(0, true)))
>>>>            .parse(scheme.getCodingSchemeName())));
>>>>
>>>>Query query = new QueryParser(null,
>>>>    new StandardAnalyzer(new CharArraySet(0, true)))
>>>>    .createBooleanQuery("propertyValue", term.getTerm(), Occur.MUST);
>>>>
>>>>ToParentBlockJoinQuery termJoinQuery = new ToParentBlockJoinQuery(
>>>>    query, codingScheme, ScoreMode.Avg);
>>>>
>>>>searcher.search(termJoinQuery, collector);
>>>>
>>>>
>>>>This is meant to retrieve parent values, but it fails on the final
>>>>line with the following stack trace:
>>>>
>>>>
>>>>Exception in thread "main" java.lang.IllegalStateException: child query must only match non-parent docs, but parent docID=2147483647 matched childScorer=class org.apache.lucene.search.TermScorer
>>>>  at org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinScorer.nextDoc(ToParentBlockJoinQuery.java:330)
>>>>  at org.apache.lucene.search.join.ToParentBlockJoinIndexSearcher.search(ToParentBlockJoinIndexSearcher.java:63)
>>>>  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:428)
>>>>  at org.lexevs.lucene.prototype.LuceneQueryTrial.luceneToParentJoinQuery(LuceneQueryTrial.java:78)
>>>>  at org.lexevs.lucene.prototype.LuceneQueryTrial.main(LuceneQueryTrial.java:327)
>>>>
>>>>
>>>>I build indexes of up to about 36 GB using code similar to the following:
>>>>
>>>>
>>>>List<Document> list = new ArrayList<Document>();
>>>>//need a static
>>>>int staticCount = count;
>>>>ParentDocObject parent = builder.generateParentDoc(
>>>>    cs.getCodingSchemeName(), cs.getVersion(), cs.getURI(), "description");
>>>>
>>>>if (cs.codingSchemeName.equals(CodingScheme.THESSCHEME.codingSchemeName)) {
>>>>  //One per coding Scheme
>>>>  int numberOfProperties = 12;
>>>>  if (!thesExactMatchDone) {
>>>>    ChildDocObject child1 = builder.generateChildDocWithSalt(
>>>>        parent, SearchTerms.BLOOD.getTerm());
>>>>    Document doc1 = builder.mapToDocumentExactMatch(child1);
>>>>    list.add(doc1);
>>>>    count++;
>>>>    numberOfProperties--;
>>>>    ChildDocObject child = builder.generateChildDocWithSalt(
>>>>        parent, SearchTerms.CHAR.term);
>>>>    Document doc = builder.mapToDocumentExactMatch(child);
>>>>    count++;
>>>>    list.add(doc);
>>>>    numberOfProperties--;
>>>>    thesExactMatchDone = true;
>>>>  }
>>>>  while (numberOfProperties > 0) {
>>>>    if (count % 547 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGenerator(
>>>>              builder.randomNumberGenerator(), SearchTerms.BLOOD.getTerm()));
>>>>      Document doc = builder.mapToDocument(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else if (count % 233 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGenerator(
>>>>              builder.randomNumberGenerator(), SearchTerms.CHAR.getTerm()));
>>>>      Document doc = builder.mapToDocument(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else if (count % 71 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGenerator(
>>>>              builder.randomNumberGenerator(), SearchTerms.ARTICLE.getTerm()));
>>>>      Document doc = builder.mapToDocument(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else if (count % 2237 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGenerator(
>>>>              builder.randomNumberGenerator(), SearchTerms.LUNG_CANCER.getTerm()));
>>>>      Document doc = builder.mapToDocument(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else if (count % 5077 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGenerator(
>>>>              builder.randomNumberGenerator(), SearchTerms.LIVER_CARCINOMA.getTerm()));
>>>>      Document doc = builder.mapToDocument(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else if (count % 2371 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGeneratorStartsWith(
>>>>              builder.randomNumberGenerator(), SearchTerms.BLOOD.getTerm()));
>>>>      Document doc = builder.mapToDocumentExactMatch(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else if (count % 79 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGeneratorStartsWith(
>>>>              builder.randomNumberGenerator(), SearchTerms.ARTICLE.getTerm()));
>>>>      Document doc = builder.mapToDocumentExactMatch(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else if (count % 3581 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGeneratorStartsWith(
>>>>              builder.randomNumberGenerator(), SearchTerms.LUNG_CANCER.getTerm()));
>>>>      Document doc = builder.mapToDocumentExactMatch(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else if (count % 23 == 0) {
>>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>>          builder.randomTextGeneratorStartsWith(
>>>>              builder.randomNumberGenerator(), SearchTerms.CHAR.getTerm()));
>>>>      Document doc = builder.mapToDocumentExactMatch(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    } else {
>>>>      ChildDocObject child = builder.generateChildDoc(parent);
>>>>      Document doc = builder.mapToDocument(child);
>>>>      list.add(doc);
>>>>      count++;
>>>>      numberOfProperties--;
>>>>    }
>>>>  }
>>>>}
>>>>
>>>>Document par = builder.mapToDocument(parent);
>>>>list.add(par);
>>>>writer.addDocuments(list);
>>>>}
>>>>
>>>>
>>>>This works pretty well until I scale it up using several instances of
>>>>this code.  When the nextChildDoc retrieved reaches id 5874902, the
>>>>following line in ToParentBlockJoinQuery
>>>>
>>>>
>>>>        parentDoc = parentBits.nextSetBit(nextChildDoc);
>>>>
>>>>
>>>>gives the value 2147483647 to parentDoc, which is not a document id in
>>>>my index, if I understand Lucene and Luke correctly, since my index has
>>>>only 42716877 documents.
>>>>
>>>>Can someone shed some light on this exception?
>>>>
>>>>
>>>>Thanks,
>>>>
>>>>Scott Bauer
>>>>
>>
>
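
For anyone hitting the same exception: one way to rule out a malformed block at index time would be to check each list before it is handed to writer.addDocuments.  The sketch below is not taken from the indexing code quoted above.  checkBlock and the isParent predicate are hypothetical names, and plain strings stand in for Lucene Document objects; it only assumes the usual block layout of children first and the parent last.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class BlockShapeCheck {
    // Hypothetical helper: throws unless the block is "children..., parent".
    static <T> void checkBlock(List<T> block, Predicate<T> isParent) {
        if (block.isEmpty()) {
            throw new IllegalArgumentException("empty block");
        }
        int last = block.size() - 1;
        // Every document before the last one must be a child.
        for (int i = 0; i < last; i++) {
            if (isParent.test(block.get(i))) {
                throw new IllegalArgumentException(
                    "parent found at position " + i + ", expected only at " + last);
            }
        }
        // The last document must be the parent.
        if (!isParent.test(block.get(last))) {
            throw new IllegalArgumentException("last document in block is not a parent");
        }
    }

    public static void main(String[] args) {
        // Stand-in rule: a "document" is a parent if its label starts with "parent".
        Predicate<String> isParent = d -> d.startsWith("parent");

        // Well-formed block: passes silently.
        checkBlock(Arrays.asList("child-1", "child-2", "parent-A"), isParent);

        // Block with a missing parent: rejected.
        try {
            checkBlock(Arrays.asList("child-1", "child-2"), isParent);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In a real indexer the predicate would presumably test whatever field the parent filter matches on, so that the check and the query agree on what counts as a parent.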
