The recent flag is super clever, and you can use it for other applications and situations as well. I would do that in a heartbeat, assuming you can reindex your data set quickly.
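
For what it's worth, here is a rough sketch of how the flag could be computed before sending documents to Solr. It is only an illustration under assumptions: the field names `title`, `year`, and `recent` are hypothetical, and it treats "recent" as "highest year per title", which may not match how your editions are actually dated.

```python
from typing import Dict, Iterable, List


def mark_recent_editions(records: Iterable[Dict]) -> List[Dict]:
    """Return copies of the records with a boolean `recent` field.

    Assumption: each record has a `title` (identical across editions)
    and a comparable `year`. Exactly one record per title gets
    recent=True; on a tie, the first record seen wins.
    """
    docs = [dict(r) for r in records]  # copy so the input is not mutated
    newest: Dict[str, Dict] = {}
    for doc in docs:
        best = newest.get(doc["title"])
        if best is None or doc["year"] > best["year"]:
            newest[doc["title"]] = doc
    for doc in docs:
        doc["recent"] = newest[doc["title"]] is doc
    return docs


editions = [
    {"id": "a", "title": "Moby Dick", "year": 2001},
    {"id": "b", "title": "Moby Dick", "year": 2010},
    {"id": "c", "title": "Walden", "year": 1999},
]
flagged = mark_recent_editions(editions)
```

At query time the MLT request would then carry a filter such as `fq=recent:true` (again, with whatever field name you actually index). The catch is that adding a newer edition means the previously flagged record has to be updated too, which is why quick reindexing matters.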
> On Apr 12, 2023, at 10:49 AM, Alessandro Benedetti <a.benede...@sease.io> wrote:
>
> Following up on Mikhail's good insights,
> I would probably recommend using the More Like This Query Parser followed
> by grouping/field collapsing on a field.
> It should solve your problem!
>
> If your requirements are more advanced, feel free to let us know!
>
> Cheers
> --------------------------
> *Alessandro Benedetti*
> Director @ Sease Ltd.
> *Apache Lucene/Solr Committer*
> *Apache Solr PMC Member*
>
> e-mail: a.benede...@sease.io
>
> *Sease* - Information Retrieval Applied
> Consulting | Training | Open Source
>
> Website: Sease.io <http://sease.io/>
> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
> <https://twitter.com/seaseltd> | Youtube
> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
> <https://github.com/seaseltd>
>
>> On Wed, 12 Apr 2023 at 13:15, Mikhail Khludnev <m...@apache.org> wrote:
>>
>> Hello Tom.
>> It's not clear which kind of MLT you are referring to: handler, query
>> parser, or component.
>> Generally there are two options for deduplication:
>> - query time: field grouping or field collapsing
>> - index time:
>>   - the MLT query might be limited to parents with titles, and children
>>     might carry editions with dates and so on
>>   - or the MLT query can be filtered to the most recent edition only for
>>     every title; thus a recent flag should be set during indexing and then
>>     used by a filter.
>>
>>> On Wed, Apr 12, 2023 at 1:22 PM Tom Tailor <aloras2...@gmail.com> wrote:
>>>
>>> Hi all
>>>
>>> I want to build a recommender using Solr MoreLikeThis. I work on
>>> bibliographic data, i.e. books. I have multiple records of different
>>> editions of the same book. For a given book, MLT returns all the
>>> different editions of the book; this is not new content from the user's
>>> point of view.
>>> I cannot deduplicate the records because the different editions are
>>> relevant for other applications.
>>>
>>> Is it possible to circumvent this? I could use the book's title, which is
>>> the same across all editions, to filter duplicates from the MLT results.
>>>
>>> Thanks for your help
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> https://t.me/MUST_SEARCH
>> A caveat: Cyrillic!
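
For reference, the query-time option discussed in the thread (MLT query parser plus collapsing on the title) could be sketched as request parameters like the ones below. This is a sketch under assumptions, not a tested configuration: the field names `subject_t` and `title_s` and the document id are hypothetical, the collapse field is assumed to be a single-valued string field, and the extra negative filter on the seed's own title is my addition based on Tom's idea of filtering by title.

```python
from typing import List, Tuple
from urllib.parse import urlencode


def build_mlt_params(
    doc_id: str,
    seed_title: str,
    qf: str = "subject_t",          # hypothetical similarity field
    collapse_field: str = "title_s",  # hypothetical single-valued title field
    rows: int = 10,
) -> List[Tuple[str, str]]:
    """Build Solr request parameters for an MLT query with deduplication.

    - q:  MLT query parser seeded with the document's unique key
    - fq: collapse so that each title appears at most once
    - fq: exclude other editions of the seed book itself
    """
    # Escape backslashes and double quotes inside the phrase query.
    escaped = seed_title.replace("\\", "\\\\").replace('"', '\\"')
    return [
        ("q", "{!mlt qf=%s}%s" % (qf, doc_id)),
        ("fq", "{!collapse field=%s}" % collapse_field),
        ("fq", '-%s:"%s"' % (collapse_field, escaped)),
        ("rows", str(rows)),
    ]


params = build_mlt_params("book-42", 'The "Big" Book')
query_string = urlencode(params)  # append to the collection's /select URL
```

Collapsing keeps one document per title (by default the highest scoring), so which edition is shown is decided by relevance rather than by date; the index-time recent flag is the alternative if a specific edition should always win.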