I'm not a big fan of multivalued fields because they don't play well with 
docValues, which enable many of the Solr features I like. They also can't be 
matched by position (for example, finding docs that have a given value in the 
3rd position). But don't take this as a recommendation against using them; 
they have their uses. 

As an alternative, check the path hierarchy tokenizer to see if it can help 
with your case:
https://solr.apache.org/guide/solr/latest/indexing-guide/tokenizers.html#path-hierarchy-tokenizer
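
To make the idea concrete, here is a rough Python sketch of the kind of prefix 
tokens a path hierarchy tokenizer produces. It only emulates the behaviour for 
illustration and is not Solr's code. The function name and the "1/7/42" path 
of document IDs (root first, document last) are assumptions made up for this 
example.

```python
def path_hierarchy_tokens(path: str, delimiter: str = "/") -> list[str]:
    """Emulate the prefix tokens a path-hierarchy tokenizer would emit.

    Illustration only: the real tokenization happens inside Solr.
    """
    parts = path.split(delimiter)
    tokens = []
    for i in range(1, len(parts) + 1):
        prefix = delimiter.join(parts[:i])
        if prefix:  # skip the empty prefix produced by a leading delimiter
            tokens.append(prefix)
    return tokens


# Hypothetical path: root document 1, its child 7, and the document 42 itself.
tokens = path_hierarchy_tokens("1/7/42")
print(tokens)
```

Since every prefix becomes a token, a search for a prefix such as "1/7" would 
match all documents stored below it. A document with several parents would 
need one path per route to the root, so whether this fits depends on how many 
distinct routes your documents have.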

-ufuk yilmaz



—

> On Nov 21, 2024, at 22:13, Marc <sedsh...@busy-byte.org> wrote:
> 
> Hi there,
> 
> I am running an application comprising approx. 20 million documents. To make 
> these documents searchable, I decided to give SOLR a try and feed 
> meta-information about my documents into SOLR using a Python script. This all 
> works fine. My question is therefore less a technical one than a 
> structural/strategic one.
> 
> In my document collection, documents can have parent-documents, and in 
> particular not only one parent document, but potentially several parent 
> documents. Each document is identified uniquely by an ID value. Child 
> documents refer to their parents using a multi-value field 'parent' which 
> holds the parent's ID values.
> 
> I am interested in the paths that lead from leaf-documents (documents that 
> only have parents, but no further children) back to the root document. My 
> idea was to add any parent document of a child document (also those further 
> away than the immediate parents, i.e. grandparents, great-grandparents, and 
> so on) into this multi-value parent property.
> 
> Later, I want to be able to pick any document X from my 20 million documents 
> and efficiently determine the set of documents of which X is a parent. I.e., 
> all documents that have an entry of X's ID somewhere in their parent field.
> 
> * Is my strategy of structuring my documents by means of the multi-value 
> property sensible?
> * Does SOLR provide better methods (that I'm not aware of) to achieve the 
> same?
> * Will this perform properly? Or is my structuring method likely to keel over 
> at some stage if the number of documents keeps growing?
> 
> Best,
> Marc
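
For reference, the flattening described in the quoted message (putting every 
ancestor, not just the direct parents, into the multi-value 'parent' field) 
could look roughly like the sketch below. The `parents` mapping, the 
`all_ancestors` helper, and the sample IDs are hypothetical; only the 'parent' 
field name comes from the message.

```python
def all_ancestors(doc_id, parents):
    """Return every ancestor of doc_id, given a child -> direct-parents mapping."""
    seen = set()
    stack = list(parents.get(doc_id, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already reached via another parent
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


# Hypothetical sample data: "leaf" has two parents that share one root.
parents = {"leaf": ["mid1", "mid2"], "mid1": ["root"], "mid2": ["root"]}

# Document as it might be sent to Solr, with all ancestors in 'parent'.
doc = {"id": "leaf", "parent": sorted(all_ancestors("leaf", parents))}
print(doc)
```

With documents indexed this way, finding everything below a document X is a 
single field query of the form parent:X.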
