Not sure if it's relevant to your needs, but you could look at the TREC Fair Ranking Track: https://fair-trec.github.io/
Karen R. Harker, MLS, MPH
Collection Assessment Librarian
UNT Libraries

-----Original Message-----
From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> On Behalf Of Ohms, Jannis
Sent: Wednesday, July 12, 2023 11:41 AM
To: CODE4LIB@LISTS.CLIR.ORG
Subject: [EXT] [CODE4LIB] How to algorithmicaly evaluate a ranking function ?

Hi all,

I want to evaluate the ranking of my discovery system so that I can tune the ranking function. Are there datasets or benchmarks I can use, i.e. a list of queries with their ranked results? I want to compare different functions and weights, so an automated, repeatable approach that does not require user tests for every run would be nice.

Thanks for your help,
Jannis
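
For the automated, repeatable evaluation asked about above, one common pattern is to score each candidate ranking function against a fixed set of relevance judgments using a metric such as nDCG. The following is only a minimal sketch under stated assumptions: the queries, document IDs, relevance grades, and the two rankers (`ranker_a`, `ranker_b`) are hypothetical placeholders, not data from any real benchmark, and a real discovery system would plug in its own search call in their place.

```python
import math


def dcg(gains):
    """Discounted cumulative gain for a list of graded relevance values."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranked_ids, judgments, k=10):
    """nDCG@k for one query; unjudged documents count as grade 0."""
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(rank_fn, qrels, k=10):
    """Average nDCG@k of a ranking function over all judged queries."""
    scores = [ndcg_at_k(rank_fn(q), judged, k) for q, judged in qrels.items()]
    return sum(scores) / len(scores)


# Hypothetical relevance judgments: query -> {doc_id: grade}.
qrels = {
    "open access policy": {"d1": 2, "d2": 1, "d3": 0},
    "linked data": {"d4": 2, "d5": 0},
}

# Hypothetical rankers standing in for two weight configurations.
RUN_A = {"open access policy": ["d1", "d2", "d3"], "linked data": ["d4", "d5"]}
RUN_B = {"open access policy": ["d3", "d2", "d1"], "linked data": ["d5", "d4"]}


def ranker_a(query):
    return RUN_A[query]


def ranker_b(query):
    return RUN_B[query]


if __name__ == "__main__":
    for name, fn in [("A", ranker_a), ("B", ranker_b)]:
        print(name, round(mean_ndcg(fn, qrels), 3))
```

Because the judgments stay fixed, the same script can be rerun after every change to the weights, and the mean score gives a single number to compare configurations. The judgments themselves still have to come from somewhere, such as a benchmark collection or a one-time labelling effort.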