To explain my question, first some domain background. We have a search engine 
where users can search for materials they can borrow at their local library.

Our top-level documents are *works*. An example of a work could be "Harry 
Potter and the Philosopher's Stone". Examples of information stored at this 
level could be the title, the author of the work, and a genre.

At the second level, we have *manifestations* (we call these "pids"). A work 
might exist as a physical book, an ebook, an audiobook on CDs, and an online 
audiobook, and there might be several editions of the book. Information 
stored at this level includes material type, year of publication, and 
contributors (for example narrators, or artists who illustrated a particular 
edition).

At the third level, we have *instances*. This includes information about the 
physical copies: which library and department they are located in, down to 
locations within departments, and whether they are currently on loan or on 
the shelf.

Each document has a `doc_type` (either work, pid, or instance). Works have a 
list of pids, and pids have a list of instances associated with them.
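To make the hierarchy concrete, here is a minimal sketch of one nested document as a Python dict. The field names `doc_type`, `pids`, `work.*`, `pid.*` and `instance.*` are the ones used in this mail; the child key `instances`, all `id` values, and the sample data are assumptions made up for illustration.

```python
# Hypothetical sample document; ids and values are invented for illustration.
WORK = {
    "id": "work-1",
    "doc_type": "work",
    "work.title": "Harry Potter and the Philosopher's Stone",
    "pids": [
        {
            "id": "pid-1",
            "doc_type": "pid",
            "pid.material_type": "book",
            "pid.year": 1997,
            # "instances" is an assumed name for the third-level child list.
            "instances": [
                {
                    "id": "instance-1",
                    "doc_type": "instance",
                    "instance.agency": "900004",
                    "instance.status": "onShelf",
                }
            ],
        }
    ],
}


def walk(doc, depth=0):
    """Yield (depth, doc_type) for a document and all of its nested children."""
    yield depth, doc["doc_type"]
    for key in ("pids", "instances"):
        for child in doc.get(key, []):
            yield from walk(child, depth + 1)


levels = list(walk(WORK))
```

Here `levels` lists one entry per document in the block, with works at depth 0, pids at depth 1, and instances at depth 2.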

Our job is to formulate Solr queries on behalf of users who belong to a 
local library, so that they can search for materials that are available to 
them. Given a query, we want to return works, along with the manifestations 
that match the query. A query can specify restrictions at all three levels; 
for example, you might be interested in the (physical) book from last year 
written by Jussi Adler-Olsen, and it should be available at the local branch 
of the community library.

The way we find the appropriate works is pretty much in place. We use the 
`/query` endpoint of Solr, and we formulate a JSON object where

* the `query` field contains the restrictions at the work level, something like 
`work.creator:'Jussi Adler-Olsen'`.
* To restrict to works that have manifestations/pids satisfying the 
restrictions at that level, we use a "parent which" construction in the 
`filter` part of the Solr query. Something like `{!parent 
which='doc_type:work'}(pid.material_type:book AND pid.year:(2021))`.
* To restrict to works where we can find a physical copy at the local library, 
we add another element to the `filter`. Something like `{!parent 
which='doc_type:work'}(instance.agency:900004 AND 
instance.status:\"onShelf\")`, where 900004 is the id of the local library.
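Put together, the request body described in the list above might be assembled as in the following sketch. The helper name `build_request` and its parameters are hypothetical; the `query` and `filter` keys and the filter strings are the ones quoted above, with `filter` given as a list of two entries.

```python
import json

# Block-join prefix used in both filters, as quoted in the list above.
PARENT = "{!parent which='doc_type:work'}"


def build_request(creator, material_type, year, agency):
    """Hypothetical helper: build the JSON body sent to the /query endpoint."""
    pid_filter = f"{PARENT}(pid.material_type:{material_type} AND pid.year:({year}))"
    instance_filter = (
        f'{PARENT}(instance.agency:{agency} AND instance.status:"onShelf")'
    )
    return {
        "query": f"work.creator:'{creator}'",
        "filter": [pid_filter, instance_filter],
    }


body = build_request("Jussi Adler-Olsen", "book", 2021, 900004)
payload = json.dumps(body)  # serialising escapes the inner quotes as \"onShelf\"
```

The values are interpolated without any escaping, so this only illustrates the shape of the request; real user input would need to be escaped before being placed in a query string.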

That seems to work well. We get the works we are interested in. The question I 
have is, how do I restrict the manifestations we return? We use the field list 
and a `childFilter` to restrict manifestations, something like this: `"fields": 
"work.workid work.title work.creator, pids, id, pid.year, pid.material_type 
[child childFilter='pid.material_type:bog' limit=-1]"`. That part of the 
filtering also seems to work OK, but we get all the manifestations that match, 
from all libraries. We want to restrict the result to those manifestations of 
which the local library has a copy.
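For reference, the field list we currently send could be built as in this sketch. The helper `build_fields` is hypothetical; the `[child childFilter=... limit=-1]` transformer and the field names are the ones quoted above. It reproduces our current behaviour (filtering on pid fields only), not the restriction we are looking for.

```python
def build_fields(field_names, child_filter):
    """Hypothetical helper: join field names and append a [child] transformer."""
    transformer = f"[child childFilter='{child_filter}' limit=-1]"
    return " ".join(list(field_names) + [transformer])


fields = build_fields(
    ["work.workid", "work.title", "work.creator", "pids", "id",
     "pid.year", "pid.material_type"],
    "pid.material_type:bog",
)
```

The resulting string would go into the `fields` key of the same JSON object that holds `query` and `filter`.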

In other words, (I guess) we need to formulate a restriction in the `[child 
childFilter=...]` part of the field list that restricts the second-level 
documents based on information stored at the third level. I am not sure how 
to do that. Can anyone help?

Thanks a lot in advance, and best regards.

/Noah


--

Noah Torp-Smith (n...@dbc.dk)
