I don’t like multivalued fields much because they don’t play nicely with docValues, which enable many cool features I like about Solr. They also can’t be matched by position (for example, finding docs that have a given value in the 3rd position). But don’t take this as a suggestion not to use them; they have their uses.
Check https://solr.apache.org/guide/solr/latest/indexing-guide/tokenizers.html#path-hierarchy-tokenizer as an alternative to see if it can help with your case.

-ufuk yilmaz

> On Nov 21, 2024, at 22:13, Marc <sedsh...@busy-byte.org> wrote:
>
> Hi there,
>
> I am running an application comprising approx. 20 million documents. To make
> these documents searchable, I decided to give SOLR a try and feed
> meta-information about my documents into SOLR using a Python script. This all
> works fine. My question is there less a technical one, but rather a
> structural/strategic one.
>
> In my document collection, documents can have parent-documents, and in
> particular not only one parent document, but potentially several parent
> documents. Each document is identified uniquely by an ID value. Child
> documents refer to their parents using a multi-value field 'parent' which
> holds the parent's ID values.
>
> I am interested in the paths that lead from leaf-documents (documents that
> only have parents, but no further children) back to the root document. My
> idea was to add any parent document of a child document (also those further
> away than the immediate parents, so grandparent and grand-grand-... parent
> documents) into this multi-value parent property.
>
> Later, I want to be able to pick any document X from my 20 million documents
> and efficiently determine the set of documents of which X is a parent. I.e.,
> all documents that have an entry of X's ID somewhere in their parent field.
>
> * Is my strategy of structuring my documents by means of the multi-value
>   property sensible?
> * Does SOLR provide better methods (that I'm not aware of) to achieve the
>   same?
> * Will this perform properly? Or is my structuring method likely to keel over
>   at some stage if the number of documents keeps growing?
>
> Best,
> Marc
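
For comparison, here is a rough Python sketch of the two ways of preparing the data before indexing: the full ancestor set for a multi-value 'parent' field, and delimited root-to-document paths of the kind a path hierarchy tokenizer could split into prefixes. The toy PARENTS mapping, the function names, and the "/" delimiter are all made up for illustration, and path_prefixes only imitates the tokenizer's prefix output in plain Python; it does not call Solr.

```python
from typing import Dict, List, Set

# Hypothetical toy data: document ID -> IDs of its immediate parents.
# Assumed to be acyclic (no document is its own ancestor).
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a", "b"],  # a document with two parents
    "leaf": ["c"],
}


def ancestor_ids(doc_id: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Collect every ancestor of doc_id (parents, grandparents, ...)."""
    seen: Set[str] = set()
    stack = list(parents.get(doc_id, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def root_paths(doc_id: str, parents: Dict[str, List[str]],
               delimiter: str = "/") -> List[str]:
    """Enumerate every root-to-document path as a delimited string."""
    direct = parents.get(doc_id, [])
    if not direct:
        return [doc_id]
    return sorted(
        prefix + delimiter + doc_id
        for parent in direct
        for prefix in root_paths(parent, parents, delimiter)
    )


def path_prefixes(path: str, delimiter: str = "/") -> List[str]:
    """Imitate the prefixes a path hierarchy tokenizer emits for one path."""
    parts = path.split(delimiter)
    return [delimiter.join(parts[: i + 1]) for i in range(len(parts))]


if __name__ == "__main__":
    print(sorted(ancestor_ids("leaf", PARENTS)))
    print(root_paths("leaf", PARENTS))
    print(path_prefixes("root/a/c/leaf"))
```

With the ancestor-set layout, finding everything below X comes down to a single term match such as parent:X. With the path layout, a document that has several parents ends up with several paths, so the path field would itself be multivalued in that case. Which one fits better depends on whether you need the paths themselves or only the set of descendants.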