Hey Solr Community,


Does anyone know if it's possible to manage the parent/child relationship for 
nested documents manually (i.e., I manage the "_root_" relationship outside of 
Solr and still take advantage of block join functionality)?

Typically, nested documents are defined as follows. In this case the "_root_" 
relationship is managed by Solr, and 4 documents are indexed (parent + 3 
children):



{
              "id": "parent_id",
              "content_type": "parent",
              "Text_Blob": ["Some Large Text Field Here"],
              "_childDocuments_": [{
                             "id": "child_id_1",
                             "content_type": "child",
                             ...more child fields
              },{
                             "id": "child_id_2",
                             "content_type": "child",
                             ...more child fields
              },{
                             "id": "child_id_3",
                             "content_type": "child",
                             ...more child fields
              }]
}
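
To make the "parent + 3 children" count concrete, here is a minimal Python sketch that counts how many documents a nested block expands to. The helper name count_block_docs is hypothetical, and the elided child fields are simply left out; it only walks the "_childDocuments_" arrays and does not talk to Solr.

```python
def count_block_docs(doc):
    """Count a document plus every document nested under "_childDocuments_"."""
    children = doc.get("_childDocuments_", [])
    return 1 + sum(count_block_docs(child) for child in children)


# Same shape as the example above, with the elided child fields omitted.
nested = {
    "id": "parent_id",
    "content_type": "parent",
    "Text_Blob": ["Some Large Text Field Here"],
    "_childDocuments_": [
        {"id": "child_id_1", "content_type": "child"},
        {"id": "child_id_2", "content_type": "child"},
        {"id": "child_id_3", "content_type": "child"},
    ],
}

print(count_block_docs(nested))  # 4
```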



The reason for this use case is that we have a large text field we don't want 
to repeat at the child level, but in some circumstances we have many child 
records, meaning our "_childDocuments_" array can get very big, forcing us to 
build large individual Solr documents when performing an update.  We've found 
it much more performant to send smaller documents to Solr with the 
relationship managed outside of Solr.



For example, using the documents above:



Document 1 - parent

{
              "id": "parent_id",
              "_root_": "parent_id",
              "content_type": "parent",
              "Text_Blob": ["Some Large Text Field Here"]
}



Documents 2, 3, 4 - three children, all related by their "_root_" element but 
managed externally to Solr



{
              "id": "child_id_1",
              "_root_": "parent_id",
              "content_type": "child",
              ...more child fields
}

{
              "id": "child_id_2",
              "_root_": "parent_id",
              "content_type": "child",
              ...more child fields
}

{
              "id": "child_id_3",
              "_root_": "parent_id",
              "content_type": "child",
              ...more child fields
}



All of the above records could be batched as individual "documents" in a JSON 
array and pushed to Solr.  The problem with this approach is that when you 
specify the "_root_" element on an update, Solr assumes it's an atomic update 
and overwrites the previous document (i.e. only the last document remains in 
the index).
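
For clarity, a minimal Python sketch of how such a flat batch could be built from the nested form. The helper name flatten_block is hypothetical, it assumes a single level of nesting, and it only builds the JSON array; it does not send anything to Solr and does not avoid the overwrite behaviour described above.

```python
import json


def flatten_block(nested):
    """Turn one nested parent into a flat list of docs sharing a "_root_" value."""
    # Copy the parent without its children so the input is left untouched.
    parent = {k: v for k, v in nested.items() if k != "_childDocuments_"}
    root_id = parent["id"]
    parent["_root_"] = root_id

    docs = [parent]
    for child in nested.get("_childDocuments_", []):
        flat_child = dict(child)
        flat_child["_root_"] = root_id  # relationship managed outside Solr
        docs.append(flat_child)
    return docs


# Same shape as the example above, with the elided child fields omitted.
nested = {
    "id": "parent_id",
    "content_type": "parent",
    "Text_Blob": ["Some Large Text Field Here"],
    "_childDocuments_": [
        {"id": "child_id_1", "content_type": "child"},
        {"id": "child_id_2", "content_type": "child"},
        {"id": "child_id_3", "content_type": "child"},
    ],
}

# One JSON array holding the parent followed by its three children.
payload = json.dumps(flatten_block(nested), indent=2)
print(payload)
```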

Is there a way to manage the parent/child relationship outside of Solr without 
us creating a large "_childDocuments_" array?



Any advice would be appreciated.



Thanks,



Dwane
