Well, it’s clear that this is just returning Integer.MAX_VALUE for the
parentDoc.  Given the recent change noted here:
 https://issues.apache.org/jira/browse/LUCENE-6021, where
FixedBitSet.nextSetBit now returns Integer.MAX_VALUE (NO_MORE_DOCS)
instead of -1 when no further bit is set, I wonder whether a bug was
introduced into the BlockJoinScorer.nextDoc method.  Unfortunately I have
yet to come up with an example that makes this fail on a smaller test
index.  The child document in question does have a parent, which is doc
#4823684, so I’m confused as to how the NO_MORE_DOCS value would be
applied.  Is there something obvious I’m missing here?
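
To make the symptom concrete, here is a minimal sketch of the lookup in plain Java. It is a simplified model and not Lucene's actual code: java.util.BitSet stands in for the parent bit set, and the class name ParentBitsSketch and the helper nextParent are hypothetical names made up for illustration. It only shows that a child doc with no parent bit set at or after it comes back as the Integer.MAX_VALUE sentinel rather than as a real doc id.

```java
import java.util.BitSet;

// Simplified model of the parent lookup; NOT Lucene's implementation.
// ParentBitsSketch and nextParent are hypothetical names for illustration.
final class ParentBitsSketch {
    // Same numeric value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Returns the first parent doc at or after childDoc,
    // or NO_MORE_DOCS when no parent bit is set from there on.
    static int nextParent(BitSet parentBits, int childDoc) {
        int parent = parentBits.nextSetBit(childDoc); // java.util.BitSet returns -1 if none
        return parent == -1 ? NO_MORE_DOCS : parent;
    }

    public static void main(String[] args) {
        BitSet parentBits = new BitSet();
        parentBits.set(3); // docs 0-2 are children, doc 3 is their parent
        // docs 4 and 5 are children whose parent bit was never set

        System.out.println(nextParent(parentBits, 1)); // prints 3
        System.out.println(nextParent(parentBits, 4)); // prints 2147483647
    }
}
```

Under this model, a parentDoc of 2147483647 would mean the bit set being scanned holds no parent bit at or after the matched child. Whether that is what happens in the failing index is an assumption that would still need to be checked against the real parent filter.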

On 6/5/15, 12:05 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
wrote:

>One correction: it looks like the parentBits call has 4823680 passed to it
>to generate the erroneous docId.
>
>On 6/5/15, 10:34 AM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
>wrote:
>
>>I should mention that this worked in 4.10.4 using a very similar code
>>base.  -scott
>>
>>On 6/4/15, 4:51 PM, "Bauer, Herbert S. (Scott)" <bauer.sc...@mayo.edu>
>>wrote:
>>
>>>I'm working with Lucene 5.1 to try to make use of the relational
>>>structure of the block join index and query mechanisms.  I'm querying
>>>with the following code:
>>>
>>>IndexReader reader = DirectoryReader.open(index);
>>>ToParentBlockJoinIndexSearcher searcher =
>>>    new ToParentBlockJoinIndexSearcher(reader);
>>>ToParentBlockJoinCollector collector =
>>>    new ToParentBlockJoinCollector(Sort.RELEVANCE, 2, true, true);
>>>
>>>BitDocIdSetFilter codingScheme = new BitDocIdSetCachingWrapperFilter(
>>>    new QueryWrapperFilter(
>>>        new QueryParser("codingSchemeName",
>>>            new StandardAnalyzer(new CharArraySet(0, true)))
>>>            .parse(scheme.getCodingSchemeName())));
>>>
>>>Query query = new QueryParser(null,
>>>    new StandardAnalyzer(new CharArraySet(0, true)))
>>>    .createBooleanQuery("propertyValue", term.getTerm(), Occur.MUST);
>>>
>>>ToParentBlockJoinQuery termJoinQuery = new ToParentBlockJoinQuery(
>>>    query, codingScheme, ScoreMode.Avg);
>>>
>>>searcher.search(termJoinQuery, collector);
>>>
>>>
>>>This is meant to retrieve the parent values, but it fails on the final
>>>line with the following stack trace:
>>>
>>>
>>>Exception in thread "main" java.lang.IllegalStateException: child query
>>>must only match non-parent docs, but parent docID=2147483647 matched
>>>childScorer=class org.apache.lucene.search.TermScorer
>>>
>>>at org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinScorer.nextDoc(ToParentBlockJoinQuery.java:330)
>>>at org.apache.lucene.search.join.ToParentBlockJoinIndexSearcher.search(ToParentBlockJoinIndexSearcher.java:63)
>>>at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:428)
>>>at org.lexevs.lucene.prototype.LuceneQueryTrial.luceneToParentJoinQuery(LuceneQueryTrial.java:78)
>>>at org.lexevs.lucene.prototype.LuceneQueryTrial.main(LuceneQueryTrial.java:327)
>>>
>>>
>>>I build indexes of up to about 36 GB using code similar to the following:
>>>
>>>
>>>List<Document> list = new ArrayList<Document>();
>>>//need a static
>>>int staticCount = count;
>>>ParentDocObject parent = builder.generateParentDoc(
>>>    cs.getCodingSchemeName(), cs.getVersion(), cs.getURI(), "description");
>>>if (cs.codingSchemeName.equals(CodingScheme.THESSCHEME.codingSchemeName)) {
>>>  //One per coding Scheme
>>>  int numberOfProperties = 12;
>>>  if (!thesExactMatchDone) {
>>>    ChildDocObject child1 =
>>>        builder.generateChildDocWithSalt(parent, SearchTerms.BLOOD.getTerm());
>>>    Document doc1 = builder.mapToDocumentExactMatch(child1);
>>>    list.add(doc1);
>>>    count++;
>>>    numberOfProperties--;
>>>    ChildDocObject child =
>>>        builder.generateChildDocWithSalt(parent, SearchTerms.CHAR.term);
>>>    Document doc = builder.mapToDocumentExactMatch(child);
>>>    count++;
>>>    list.add(doc);
>>>    numberOfProperties--;
>>>    thesExactMatchDone = true;
>>>  }
>>>  while (numberOfProperties > 0) {
>>>    if (count % 547 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGenerator(
>>>              builder.randomNumberGenerator(), SearchTerms.BLOOD.getTerm()));
>>>      Document doc = builder.mapToDocument(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else if (count % 233 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGenerator(
>>>              builder.randomNumberGenerator(), SearchTerms.CHAR.getTerm()));
>>>      Document doc = builder.mapToDocument(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else if (count % 71 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGenerator(
>>>              builder.randomNumberGenerator(), SearchTerms.ARTICLE.getTerm()));
>>>      Document doc = builder.mapToDocument(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else if (count % 2237 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGenerator(
>>>              builder.randomNumberGenerator(), SearchTerms.LUNG_CANCER.getTerm()));
>>>      Document doc = builder.mapToDocument(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else if (count % 5077 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGenerator(
>>>              builder.randomNumberGenerator(), SearchTerms.LIVER_CARCINOMA.getTerm()));
>>>      Document doc = builder.mapToDocument(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else if (count % 2371 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGeneratorStartsWith(
>>>              builder.randomNumberGenerator(), SearchTerms.BLOOD.getTerm()));
>>>      Document doc = builder.mapToDocumentExactMatch(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else if (count % 79 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGeneratorStartsWith(
>>>              builder.randomNumberGenerator(), SearchTerms.ARTICLE.getTerm()));
>>>      Document doc = builder.mapToDocumentExactMatch(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else if (count % 3581 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGeneratorStartsWith(
>>>              builder.randomNumberGenerator(), SearchTerms.LUNG_CANCER.getTerm()));
>>>      Document doc = builder.mapToDocumentExactMatch(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else if (count % 23 == 0) {
>>>      ChildDocObject child = builder.generateChildDocWithSalt(parent,
>>>          builder.randomTextGeneratorStartsWith(
>>>              builder.randomNumberGenerator(), SearchTerms.CHAR.getTerm()));
>>>      Document doc = builder.mapToDocumentExactMatch(child);
>>>      list.add(doc);
>>>      count++; numberOfProperties--;
>>>    } else {
>>>      ChildDocObject child = builder.generateChildDoc(parent);
>>>      Document doc = builder.mapToDocument(child);
>>>      list.add(doc);
>>>      count++;
>>>      numberOfProperties--;
>>>    }
>>>  }
>>>}
>>>Document par = builder.mapToDocument(parent);
>>>list.add(par);
>>>writer.addDocuments(list);
>>>}
>>>
>>>
>>>This works pretty well until I scale it up by running several instances
>>>of it.  When the nextChildDoc retrieved reaches id 5874902, the following
>>>line in ToParentBlockJoinQuery
>>>
>>>
>>>        parentDoc = parentBits.nextSetBit(nextChildDoc);
>>>
>>>
>>>gives the value 2147483647 to parentDoc, which is not a document id in
>>>my index, if I understand Lucene and Luke correctly, since my index has
>>>only 42716877 documents.
>>>
>>>Can someone shed some light on this exception?
>>>
>>>
>>>Thanks,
>>>
>>>Scott Bauer
>>>
>>>
>>>
>>>
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>
