Hi Marc,

There are two concerns to consider:

- What you need to have in search results. If you need each file to appear
  only once, nested documents are the way to go. If it is fine for many
  versions of the same file to appear, denormalize them. Field collapsing,
  grouping, and query-time join provide some flexibility, but you need to
  verify that they work for your queries.
- Cost of updates. If new versions appear often and a file has many of
  them, reindex amplification may put too much load on indexing.

So, prototype and test whether it can meet your SLA. Caveat emptor.
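To make the two options concrete, here is a rough Python sketch of the document shapes and of the update cost. All field names (`owner`, `acl`, `versions`, `file_id`) and the helper functions are made up for illustration. The cost function assumes that changing a nested block rewrites the parent and all of its children, which you should verify for your Solr version.

```python
def to_nested(file_doc, versions):
    """Nested layout: the file is the parent, its versions are child documents.

    Owner and ACL live only on the parent. The "versions" key is a
    hypothetical child field name.
    """
    doc = dict(file_doc)
    doc["versions"] = [dict(v) for v in versions]
    return doc


def to_denormalized(file_doc, versions):
    """Denormalized layout: one stand-alone document per version.

    The file-level fields (owner, ACL) are copied into every version, and
    "file_id" (hypothetical) points back to the file.
    """
    shared = {k: v for k, v in file_doc.items() if k != "id"}
    return [{**shared, **v, "file_id": file_doc["id"]} for v in versions]


def docs_written_for_new_version(existing_versions, nested):
    """Rough count of documents written when one new version arrives.

    Assumption: a nested block is rewritten as a whole (parent, existing
    children, and the new child); a denormalized index writes one document.
    """
    return existing_versions + 2 if nested else 1


file_doc = {"id": "file-1", "owner": "marc", "acl": ["staff"]}
versions = [
    {"id": "file-1-v1", "version": 1},
    {"id": "file-1-v2", "version": 2},
]

nested = to_nested(file_doc, versions)
flat = to_denormalized(file_doc, versions)
```

With 2 existing versions, the sketch counts 4 documents written for the nested layout against 1 for the denormalized one. The gap grows with the number of versions per file, which is the amplification to measure in a prototype.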
On Mon, May 26, 2025 at 5:43 PM Marc <sedsh...@busy-byte.org> wrote:

> Hi there,
>
> I am working on a new Solr installation for our application. Here, we
> want to store file information in file documents and file versions in
> fileversion documents. All file versions are related to file documents,
> therefore the question arises whether it would be a good idea to store
> the file versions as child documents of the file documents? This way,
> certain file properties like owner or ACLs would not need to be copied
> to all the file version documents, but this info would only exist in
> the parent (file) document.
>
> In a few Solr books I read that denormalization of the data is the most
> common way to store data in Solr. That would mean it would be more
> sensible to store the file versions as stand-alone documents, not in a
> nested manner, but have them side by side with the file documents and add
> owner and ACL entries to each file version?
>
> Are there any best practice guides on this sort of design decision?
>
> Any hints would be appreciated.
>
> Thanks,
> Marc
>

--
Sincerely yours
Mikhail Khludnev