[
https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044828#comment-13044828
]
Mark Harwood commented on LUCENE-2454:
--------------------------------------
Below are two example tests that search employment resumes. Both use the same
mandatory and optional clauses, but combine them in subtly different ways.
The first question is "who has Mahout skills and preferably used them at
Lucid?", while the second is "who has Mahout skills and preferably has been
employed by Lucid?". The questions differ, and so do the answers. Below is the
XML test script that defines the data and queries, states the expected
results, and runs as an executable test.
Hopefully you can make sense of this:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<Test description="NestedQuery tests">
  <Data>
    <Index name="ResumeIndex">
      <Analyzers class="org.apache.lucene.analysis.WhitespaceAnalyzer">
      </Analyzers>
      <Shard name="shard1">
        <!-- =============================================================== -->
        <Document pk="1">
          <Field name="name">grant</Field>
          <Field name="docType">resume</Field>
        </Document>
        <!-- =============================================================== -->
        <Document pk="2">
          <Field name="employer">lucid</Field>
          <Field name="docType">employment</Field>
          <Field name="skills">java lucene</Field>
        </Document>
        <!-- =============================================================== -->
        <Document pk="3">
          <Field name="employer">somewhere else</Field>
          <Field name="docType">employment</Field>
          <Field name="skills">mahout and more mahout</Field>
        </Document>
        <!-- =============================================================== -->
        <Document pk="4">
          <Field name="name">sean</Field>
          <Field name="docType">resume</Field>
        </Document>
        <!-- =============================================================== -->
        <Document pk="5">
          <Field name="employer">foo bar</Field>
          <Field name="docType">employment</Field>
          <Field name="skills">java</Field>
        </Document>
        <!-- =============================================================== -->
        <Document pk="6">
          <Field name="employer">some co</Field>
          <Field name="docType">employment</Field>
          <Field name="skills">mahout mahout and more mahout</Field>
        </Document>
      </Shard>
    </Index>
  </Data>
  <Tests>
    <Test description="Who knows Mahout and preferably used it *while employed at Lucid*?">
      <Query>
        <NestedQuery>
          <!-- testing properties of individual child employment docs -->
          <Query>
            <BooleanQuery>
              <Clause occurs="must">
                <TermsQuery fieldName="skills">mahout</TermsQuery>
              </Clause>
              <Clause occurs="should">
                <TermsQuery fieldName="employer">lucid</TermsQuery>
              </Clause>
            </BooleanQuery>
          </Query>
          <ParentsFilter>
            <TermsFilter fieldName="docType">resume</TermsFilter>
          </ParentsFilter>
        </NestedQuery>
      </Query>
      <ExpectedResults why="Grant's tenure at Lucid is overlooked for scoring purposes because it did not involve the required Mahout. Sean has more Mahout experience">
        <Result fieldName="pk">4</Result>
        <Result fieldName="pk">1</Result>
      </ExpectedResults>
    </Test>
    <!-- ==================================================================================== -->
    <Test description="Different question - who knows Mahout and preferably has been employed by Lucid?">
      <Query>
        <BooleanQuery>
          <Clause occurs="must">
            <NestedQuery>
              <!-- testing properties of one child employment doc -->
              <Query>
                <TermsQuery fieldName="skills">mahout</TermsQuery>
              </Query>
              <ParentsFilter>
                <TermsFilter fieldName="docType">resume</TermsFilter>
              </ParentsFilter>
            </NestedQuery>
          </Clause>
          <Clause occurs="should">
            <!-- Another NestedQuery testing properties of *potentially different* child employment docs -->
            <NestedQuery>
              <Query>
                <TermsQuery fieldName="employer">lucid</TermsQuery>
              </Query>
              <ParentsFilter>
                <TermsFilter fieldName="docType">resume</TermsFilter>
              </ParentsFilter>
            </NestedQuery>
          </Clause>
        </BooleanQuery>
      </Query>
      <ExpectedResults why="Grant has the required Mahout skills plus the optional Lucid engagement">
        <Result fieldName="pk">1</Result>
        <Result fieldName="pk">4</Result>
      </ExpectedResults>
    </Test>
    <!-- ==================================================================================== -->
  </Tests>
</Test>
{code}
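For readers without the test harness, the difference between the two questions
can be sketched in plain Java. This is only a toy model, not the NestedQuery
implementation from the patch: the class and method names are hypothetical,
scoring is assumed to be sqrt(term frequency) (loosely after the tf part of
Lucene's default similarity, ignoring idf, norms and coord), and a parent is
assumed to take the score of its best-matching child.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** Toy in-memory model of the two resume questions; not Lucene code. */
class NestedQuerySketch {

    /** One child "employment" document (hypothetical helper type). */
    static final class Job {
        final String employer;
        final String skills;
        Job(String employer, String skills) {
            this.employer = employer;
            this.skills = skills;
        }
    }

    /** Resume pk -> its employment children, mirroring the test data above. */
    static Map<Integer, List<Job>> resumes() {
        Map<Integer, List<Job>> m = new LinkedHashMap<>();
        m.put(1, Arrays.asList(new Job("lucid", "java lucene"),
                               new Job("somewhere else", "mahout and more mahout")));
        m.put(4, Arrays.asList(new Job("foo bar", "java"),
                               new Job("some co", "mahout mahout and more mahout")));
        return m;
    }

    /** Assumed scoring: sqrt(term frequency) over whitespace tokens, 0 if absent. */
    static double score(String text, String term) {
        long tf = Arrays.stream(text.split("\\s+")).filter(term::equals).count();
        return Math.sqrt(tf);
    }

    /** Question 1: both clauses are tested against the same child. */
    static double sameJobScore(List<Job> jobs) {
        double best = 0;
        for (Job j : jobs) {
            double must = score(j.skills, "mahout");
            if (must == 0) continue; // mandatory clause failed for this child
            best = Math.max(best, must + score(j.employer, "lucid"));
        }
        return best;
    }

    /** Question 2: each clause may be satisfied by a different child. */
    static double anyJobScore(List<Job> jobs) {
        double must = 0;
        double should = 0;
        for (Job j : jobs) {
            must = Math.max(must, score(j.skills, "mahout"));
            should = Math.max(should, score(j.employer, "lucid"));
        }
        return must == 0 ? 0 : must + should;
    }

    /** Ranks resume pks by descending score, dropping non-matching resumes. */
    static List<Integer> rank(ToDoubleFunction<List<Job>> scorer) {
        Map<Integer, List<Job>> data = resumes();
        List<Integer> hits = new ArrayList<>();
        for (Integer pk : data.keySet()) {
            if (scorer.applyAsDouble(data.get(pk)) > 0) hits.add(pk);
        }
        hits.sort(Comparator.comparingDouble(
                (Integer pk) -> -scorer.applyAsDouble(data.get(pk))));
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("Q1: " + rank(NestedQuerySketch::sameJobScore)); // Q1: [4, 1]
        System.out.println("Q2: " + rank(NestedQuerySketch::anyJobScore));  // Q2: [1, 4]
    }
}
```

Under these assumptions the first question ranks Sean ahead of Grant, because
Grant's Lucid job never contributes to a child that lacks Mahout, while the
second question ranks Grant first, because his Lucid job counts even though it
is a different child from the one carrying the Mahout skill.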
> Nested Document query support
> -----------------------------
>
> Key: LUCENE-2454
> URL: https://issues.apache.org/jira/browse/LUCENE-2454
> Project: Lucene - Java
> Issue Type: New Feature
> Components: core/search
> Affects Versions: 3.0.2
> Reporter: Mark Harwood
> Assignee: Mark Harwood
> Priority: Minor
> Attachments: LUCENE-2454.patch, LuceneNestedDocumentSupport.zip
>
>
> A facility for querying nested documents in a Lucene index as outlined in
> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene