Hi, Dwane.

Breaking blocks is not possible. Underneath, this is a block join: a parent doc must be indexed together with its children. You can experiment with the query-time join, {!join}, which provides more indexing flexibility at the cost of query performance.

Thanks,
Mikhail
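As a rough illustration of the query-time join mentioned above, here is a minimal sketch that only builds the query strings. It assumes parent and children are indexed as ordinary flat documents and that each child carries its parent's id in a regular field; the field name `parent_id_s` and the helper function names are hypothetical and do not come from the thread. `_root_` is deliberately not used here.

```python
from urllib.parse import urlencode

# Hypothetical field: each child document stores its parent's id here.
# This is an ordinary field chosen for illustration, not Solr's "_root_".
PARENT_REF_FIELD = "parent_id_s"


def parents_of_matching_children(child_query: str) -> str:
    """Return a {!join} query selecting parents whose children match child_query.

    The join collects PARENT_REF_FIELD values from documents matching
    child_query, then returns documents whose "id" equals one of them.
    """
    return f"{{!join from={PARENT_REF_FIELD} to=id}}{child_query}"


def children_of_matching_parents(parent_query: str) -> str:
    """Return a {!join} query selecting children whose parent matches parent_query."""
    return f"{{!join from=id to={PARENT_REF_FIELD}}}{parent_query}"


# Flat documents that can be sent in small batches, without "_childDocuments_".
docs = [
    {"id": "parent_id", "content_type": "parent"},
    {"id": "child_id_1", "content_type": "child", PARENT_REF_FIELD: "parent_id"},
    {"id": "child_id_2", "content_type": "child", PARENT_REF_FIELD: "parent_id"},
]

# URL-encoded parameters for a select request (host and collection omitted).
params = urlencode({"q": parents_of_matching_children("content_type:child")})
```

The trade-off is the one described in the reply: documents no longer have to be sent as a single block, but the join is resolved at query time and is typically slower than a block join.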
On Thu, Sep 8, 2022 at 4:48 AM Dwane Hall <dwaneh...@hotmail.com> wrote:

> Hey Solr Community,
>
> Does anyone know if it's possible to manage the parent/child relationship
> for nested documents manually? (i.e I manage the "_root_" relationship
> outside of Solr and still take advantage of block join functionality?).
>
> Typically nested documents are defined as follows:
>
> In the below circumstance the "_root_" relationship is managed by Solr and
> would result in 4 documents indexed (parent + 3 children)
>
> {
>   "id": "parent_id",
>   "content_type": "parent",
>   "Text_Blob": ["Some Large Text Field Here"],
>   "_childDocuments_": [{
>     "id": "child_id_1",
>     "content_type": "child",
>     ...more child fields
>   },{
>     "id": "child_id_2",
>     "content_type": "child",
>     ...more child fields
>   },{
>     "id": "child_id_3",
>     "content_type": "child",
>     ...more child fields
>   }]
> }
>
> The reason for the use case is we have a large text field we don't want to
> repeat at child level in the document but in some circumstances we have
> many child records meaning our "_childDocuments_" array can get very big
> forcing us to build large individual Solr documents when performing an
> update. We've found it much more performant sending smaller documents
> across to Solr with the relationship managed outside of Solr
>
> i.e. Using the example above
>
> document 1 - parent
>
> {
>   "id": "parent_id",
>   "_root_": "parent_id",
>   "content_type": "parent",
>   "Text_Blob": ["Some Large Text Field Here"]
> }
>
> document 2,3,4 - three children all related by their "_root_" element but
> managed external to Solr
>
> {
>   "id": "child_id_1",
>   "_root_": "parent_id",
>   "content_type": "child",
>   ...more child fields
> }{
>   "id": "child_id_2",
>   "_root_": "parent_id",
>   "content_type": "child",
>   ...more child fields
> }{
>   "id": "child_id_3",
>   "_root_": "parent_id",
>   "content_type": "child",
>   ...more child fields
> }
>
> All of the above records could be batched as individual "documents" in
> JSON Array format and pushed to Solr. The problem with the above approach
> is when you specify the "_root_" element on an update Solr assumes it's an
> atomic update and overrides the previous document (i.e. only the last
> document remains in the index).
>
> Is there a way to manage the parent/child relationship outside of Solr
> without us creating a large "_childDocuments_" array?
>
> Any advice would be appreciated.
>
> Thanks,
>
> Dwane

--
Sincerely yours
Mikhail Khludnev