We are using Solr 8.9.0 in SolrCloud mode with 2 shards, each shard having 
one replica. The cluster uses a ZooKeeper ensemble of 5 nodes.
We have created 2 collections: 1) forms 2) actionscomments
Schema fields of forms:
<field name="id" type="string" indexed="true" stored="true" required="true" 
multiValued="false" docValues="true"/>
<field name="form_id" type="tint" indexed="true" stored="true" 
docValues="true"/>
<field name="form_title" type="string" indexed="true" stored="true" 
omitNorms="true" omitTermFreqAndPositions="true"/>
<field name="form_creation_date" type="date" indexed="true" stored="true" 
docValues="true"/>
other fields ...

Schema fields of actionscomments:
<field name="id" type="string" indexed="true" stored="true" required="true" 
multiValued="false" docValues="true"/>
<field name="project_id" type="tint" indexed="true" stored="true" 
docValues="true"/>
<field name="form_id" type="tint" indexed="true" stored="true" 
docValues="true"/>
<field name="action_id" type="tint" indexed="true" stored="true"/>
<field name="action_date" type="date" indexed="true" stored="true" 
docValues="true"/>
<field name="action_name" type="text_string" indexed="true" stored="true" 
omitNorms="true" omitTermFreqAndPositions="true"/>
<field name="user" type="tint" indexed="true" stored="true"/>
other fields ...

There is a one-to-many relationship between the forms and actionscomments 
collections. The forms collection holds 30 million records and 
actionscomments holds 700 million. The form listing shows the columns 
form_title, form_creation_date and action_name, and every column is sortable.
If a user sorts on action_name, we fetch the records in two steps:
- First, get all form_ids for that user from the actionscomments collection, 
sorted by action_name.
- Then append those form ids, with boosts, to the query that fetches the 
forms data.

Please note that we set the score to 1 for all documents while indexing, 
which is why we use a boost query.

Example:

actionscomments:
Query: user:(713073) AND entity_type:(1) AND project_id:(2076645) sort 
field=action_name
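
A minimal sketch of how the first request above could be expressed as Solr 
select parameters. The helper name build_actions_params is hypothetical, and 
the ascending sort direction, the fl=form_id field list and the rows value 
are assumptions, since the example only says "sort field=action_name".

```python
from urllib.parse import urlencode


def build_actions_params(user_id, entity_type, project_id, rows=100):
    """Build select parameters for step 1 (hypothetical helper)."""
    q = (
        f"user:({user_id}) AND entity_type:({entity_type}) "
        f"AND project_id:({project_id})"
    )
    return {
        "q": q,
        "fl": "form_id",            # assumption: only form_id is needed
        "sort": "action_name asc",  # assumption: ascending order
        "rows": rows,               # assumption: page size
    }


params = build_actions_params(713073, 1, 2076645)
query_string = urlencode(params)  # URL-encoded form of the parameters
```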

forms:
Query: ((doctype:Apps AND appType:1 AND is_formtype_active:true AND 
(is_draft:false OR msg_type_id:1) AND instance_group_id:(2313622) AND 
project_id:(2076645))) AND msg_id:(10130987^2 10139990^1 *)^20
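
A minimal sketch of how a boosted clause like the msg_id one above could be 
assembled from the ids returned by the first query. The helper name 
build_boost_clause is hypothetical. It assumes the ids arrive already sorted, 
that the first id gets the highest boost (N for N ids, down to 1), and that a 
trailing * keeps the remaining documents in the result, as in the example.

```python
def build_boost_clause(sorted_ids, field="msg_id", outer_boost=20):
    """Turn ids sorted by action_name into a descending-boost clause."""
    n = len(sorted_ids)
    # First id gets boost n, the next n - 1, and so on down to 1.
    terms = [f"{doc_id}^{n - i}" for i, doc_id in enumerate(sorted_ids)]
    # The wildcard matches documents that are not in the id list.
    terms.append("*")
    return f"{field}:({' '.join(terms)})^{outer_boost}"


clause = build_boost_clause([10130987, 10139990])
print(clause)  # msg_id:(10130987^2 10139990^1 *)^20
```

The clause grows linearly with the number of ids, so a user with many forms 
produces a very large query string.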

This approach takes a long time to return results.

Is there any approach that lets us join and sort across two collections in 
SolrCloud mode?
