It sounds like nobody knows. Just one last try. Up !

Tx

On 2021/04/21 10:08:16, Alain Rogister <alain.rogis...@gmail.com> wrote: 
> Hi,
> 
> Imagine that you have been tasked with designing a Solr-based system that 
> indexes and queries data aggregated from several relational databases. All 
> the data items are ultimately related to a common object type, so the logical 
> data schema looks like :
> 
> Customer
>      Profile
>      Events
>      Interactions
>            Steps
>      ActionPlans
>            Objectives
>                     Actions
> (and quite a few more, with up to 3 or 4 levels of nesting)
> 
> Each of these has a number of attributes. What you need to do is the Solr 
> equivalent of a SQL query that would find all Customers where 
> Profile.item1='...' and there is some Event where event.item1='..' and 
> event.item2='..' and there is some ActionPlan that has an Objective where 
> Objective.item3='..' and this Objective has an Action where Action.item1='..' 
> and Action.item2='..'
> 
> This is fairly trivial in a relational DB. But is it achievable in Solr ? 
> Needless to say, there is no way a flat schema can work here. So I have 
> assumed that nested documents were the only way to go. I have created a 
> schema that includes these prerequisites, where "docType" is meant to contain 
> the document type i.e. Profile, Event, Interaction etc :
> 
>  <field name="_version_" type="plong" indexed="false" stored="false"/>
>  <field name="_nest_path_" type="_nest_path_" indexed="true" stored="true" 
> docValues="true"/>
>  <fieldType name="_nest_path_" class="solr.NestPathField" />
>  <field name="_nest_parent_" type="string" indexed="true" stored="true" 
> docValues="true"/>
>  <field name="_root_" type="string" indexed="true" stored="true" 
> docValues="true" />
>  <field name="docType" type="string" indexed="true" stored="true" 
> docValues="true" required="true"/> 
>  <field name="id" type="string" indexed="true" stored="true" 
> docValues="true"/> 
>  <uniqueKey>id</uniqueKey> 
> 
> and I have used naming conventions on the other fields to reflect the 
> structure, e.g.
> 
> <field name="customerId" type="string" indexed="true" stored="true" 
> docValues="true"/>
> <field name="customerName" type="string" indexed="true" stored="true" 
> docValues="true"/>
> <field name="interaction.code" type="string" indexed="true" stored="true" 
> docValues="true"/>
> <field name="interaction.result" type="string" indexed="true" stored="true" 
> docValues="true"/>
> <field name="step.code" type="string" indexed="true" stored="true" 
> docValues="true"/>
> <field name="step.result" type="string" indexed="true" stored="true" 
> docValues="true"/>
> <field name="event.type" type="string" indexed="true" stored="true" 
> docValues="true"/>
> <field name="event.date" type="pdate" indexed="true" stored="true" 
> docValues="true"/>
> 
> I have been able to index my documents correctly, i.e. retrieving a Customer 
> by id with fl=*,[child limit=999] returns the customer and all dependent 
> objects in the correct hierarchy.
> 
> I have also been able to express queries on two levels, such as: find all 
> Customers that have an Interaction with code='..' and result='..' and that 
> have an Event with type='..' and date before '...'. The syntax looks like 
> this, using two filter queries and block joins :
> 
> q= docType:Customer AND (<simple filtering clauses>)
> fq= {!parent which=docType:Customer} +docType:Interaction 
> +interaction.code:C01 +interaction.result:COMPLETE
> fq= {!parent which=docType:Customer} +docType:Event +event.type:EVTTYPE1 
> +event.date:[NOW-1YEAR TO NOW]
> 
> That seems to return correct results. However, I can't quite figure out the 
> proper way to take it one level deeper e.g. add filtering conditions on the 
> Steps of the Interactions selected by the first filter query.
> 
> For the record, I am able to do so using streaming expressions, but these 
> bring their own complexities so I am trying to find out first if something 
> simpler can save me.
> 
> Thanks !
> 
> 

Reply via email to