Hi,

Thanks, I knew about that one, but note the date : 2012. A lot has changed in 
the meantime, specifically block joins and improved nested documents support. 
But I wonder about what can be expressed, even if we leave performance issues 
aside for another post. I can't find a comprehensive example matching my own 
anywhere. Let's hope some brave soul steps in !

On 2021/04/22 19:54:38, Eran Buchnick <buchni...@gmail.com> wrote: 
> Read this before going deeper
> https://lucidworks.com/post/solr-and-joins/
> 
> Good luck
> 
> On Wed, 21 Apr 2021, 13:55 Alain Rogister, <alain.rogis...@gmail.com> wrote:
> 
> > Hi,
> >
> > Imagine that you have been tasked with designing a Solr-based system that
> > indexes and queries data aggregated from several relational databases. All
> > the data items are ultimately related to a common object type, so the
> > logical data schema looks like :
> >
> > Customer
> >      Profile
> >      Events
> >      Interactions
> >            Steps
> >      ActionPlans
> >            Objectives
> >                     Actions
> > (and quite a few more, with up to 3 or 4 levels of nesting)
> >
> > Each of these has a number of attributes. What you need to do is the Solr
> > equivalent of a SQL query that would find all Customers where
> > Profile.item1='...' and there is some Event where event.item1='..' and
> > event.item2='..' and there is some ActionPlan that has an Objective where
> > Objective.item3='..' and this Objective has an Action where
> > Action.item1='..' and Action.item2='..'
> >
> > This is fairly trivial in a relational DB. But is it achievable in Solr ?
> > Needless to say, there is no way a flat schema can work here. So I have
> > assumed that nested documents were the only way to go. I have created a
> > schema that includes these prerequisites, where "docType" is meant to
> > contain the document type i.e. Profile, Event, Interaction etc :
> >
> >  <field name="_version_" type="plong" indexed="false" stored="false"/>
> >  <field name="_nest_path_" type="_nest_path_" indexed="true" stored="true"
> > docValues="true"/>
> >  <fieldType name="_nest_path_" class="solr.NestPathField" />
> >  <field name="_nest_parent_" type="string" indexed="true" stored="true"
> > docValues="true"/>
> >  <field name="_root_" type="string" indexed="true" stored="true"
> > docValues="true" />
> >  <field name="docType" type="string" indexed="true" stored="true"
> > docValues="true" required="true"/>
> >  <field name="id" type="string" indexed="true" stored="true"
> > docValues="true"/>
> >  <uniqueKey>id</uniqueKey>
> >
> > and I have used naming conventions on the other fields to reflect the
> > structure, e.g.
> >
> > <field name="customerId" type="string" indexed="true" stored="true"
> > docValues="true"/>
> > <field name="customerName" type="string" indexed="true" stored="true"
> > docValues="true"/>
> > <field name="interaction.code" type="string" indexed="true" stored="true"
> > docValues="true"/>
> > <field name="interaction.result" type="string" indexed="true"
> > stored="true" docValues="true"/>
> > <field name="step.code" type="string" indexed="true" stored="true"
> > docValues="true"/>
> > <field name="step.result" type="string" indexed="true" stored="true"
> > docValues="true"/>
> > <field name="event.type" type="string" indexed="true" stored="true"
> > docValues="true"/>
> > <field name="event.date" type="pdate" indexed="true" stored="true"
> > docValues="true"/>
> >
> > I have been able to index my documents correctly, i.e. retrieving a
> > Customer by id with fl=*,[child limit=999] returns the customer and all
> > dependent objects in the correct hierarchy.
> >
> > I have also been able to express queries on two levels, such as: find all
> > Customers that have an Interaction with code='..' and result='..' and that
> > have an Event with type='..' and date before '...'. The syntax looks like
> > this, using two filter queries and block joins :
> >
> > q= docType:Customer AND (<simple filtering clauses>)
> > fq= {!parent which=docType:Customer} +docType:Interaction
> > +interaction.code:C01 +interaction.result:COMPLETE
> > fq= {!parent which=docType:Customer} +docType:Event +event.type:EVTTYPE1
> > +event.date:[NOW-1YEAR TO NOW]
> >
> > That seems to return correct results. However, I can't quite figure out
> > the proper way to take it one level deeper e.g. add filtering conditions on
> > the Steps of the Interactions selected by the first filter query.
> >
> > For the record, I am able to do so using streaming expressions, but these
> > bring their own complexities so I am trying to find out first if something
> > simpler can save me.
> >
> > Thanks !
> >
> >
> 

Reply via email to