I’ll echo what Alessandro has said: there are a lot of ideas out there, but we need someone to really push them forward and make it all “real”.
You may be interested in https://github.com/apache/solr/pull/1999
It demos using Solr’s file store API to push a model out to a Solr node, whereupon it deploys around the cluster! It has some awkwardness in it, so it needs work, but it’s a nice example of the pieces starting to appear...

Eric

> On Jan 24, 2024, at 5:02 AM, Alessandro Benedetti <a.benede...@sease.io> wrote:
>
> Hi Ufuk, Rajani,
> I think Rajani meant sentence transformer models i.e. large language models
> fine-tuned for sentence similarity and capable of encoding text to vectors
> (potentially with multi-modality).
> The streaming expression pointers you listed refer to
> regression/classification machine learning models :)
>
> To answer Rajani, there's no official 'roadmap' in Apache Solr.
> As a project, Apache Solr is driven by the PMC (I am part of it) on a
> volunteer basis.
>
> Then there are team's and individual contributors' roadmaps, take it as the
> list of ideas and code a single contributor or team, would like to
> contribute.
> This will need to be peer-reviewed and in general, even if it's on the
> roadmap of an individual committer, you can't just push it (I mean,
> technically you can, but if it's crazy stuff it's likely going to be
> removed).
>
> Aside from that, the integrations you mentioned are on my team roadmap:
> https://sease.io/2023/10/apache-lucene-solr-ai-roadmap-do-you-want-to-make-it-happen.html
> But at the moment we are waiting for sponsors to make it happen.
>
> Cheers
>
> --------------------------
> *Alessandro Benedetti*
> Director @ Sease Ltd.
> *Apache Lucene/Solr Committer*
> *Apache Solr PMC Member*
>
> e-mail: a.benede...@sease.io
>
>
> *Sease* - Information Retrieval Applied
> Consulting | Training | Open Source
>
> Website: Sease.io <http://sease.io/>
> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
> <https://twitter.com/seaseltd> | Youtube
> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
> <https://github.com/seaseltd>
>
>
> On Wed, 24 Jan 2024 at 08:19, uyil...@vivaldi.net.INVALID
> <uyil...@vivaldi.net.invalid> wrote:
>
>> There's a way to produce, use and store them but it only supports a fixed
>> format:
>>
>> https://solr.apache.org/guide/8_5/stream-source-reference.html#model
>> https://solr.apache.org/guide/8_5/stream-source-reference.html#train
>> https://solr.apache.org/guide/8_5/stream-decorator-reference.html#classify
>>
>> maybe it can be used if your model is possible to be expressed in the
>> given format
>>
>>
>> -ufuk yilmaz
>> ________________________________
>> From: rajani m <rajinima...@gmail.com>
>> Sent: Wednesday, January 24, 2024 12:37 AM
>> To: solr-user <solr-u...@lucene.apache.org>
>> Subject: ML Model Management in Solr
>>
>> Hi All,
>>
>> Is it in the road map to support trained ML model deployment in solr?
>> Models that can be deployed to generate text embedding at ingest time and
>> query embeddings at query time.
>>
>> Thanks,
>> Rajani
>>

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | My Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.