Thanks for the reply. Both options you mentioned may work under certain circumstances. In my current scenario, giving the applications (Flink jobs) that need the collection the responsibility for creating it would not work without additional workarounds to synchronize multiple instances, for example when several of them try to create the same collection. Adding an init container and updating it every time a collection is added, removed or updated seems like the only option that could work, but it would add further modules to a Kubernetes Solr deployment that require maintenance too.
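To illustrate the synchronization concern, here is a minimal sketch of an idempotent bootstrap step that an init container or job could run. It uses the LIST and CREATE actions of the ordinary Solr Collections API. The function names, the `fetch` hook and the example values are my own assumptions for illustration, and the config set is assumed to have been uploaded already; this is not something provided by the Solr operator.

```python
import json
import urllib.parse
import urllib.request


def http_get_json(url):
    """Default transport: GET the URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def ensure_collection(base_url, name, config_name, num_shards=1,
                      replication_factor=1, fetch=http_get_json):
    """Create the collection only if it is missing.

    Returns True if this call created it, False if it already existed.
    `fetch` is a hypothetical hook so the transport can be swapped out.
    """
    api = base_url.rstrip("/") + "/solr/admin/collections"

    def existing():
        # Collections API: action=LIST returns {"collections": [...]}.
        query = urllib.parse.urlencode({"action": "LIST", "wt": "json"})
        return fetch(api + "?" + query).get("collections", [])

    if name in existing():
        return False

    query = urllib.parse.urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,  # assumed to exist already
        "wt": "json",
    })
    try:
        fetch(api + "?" + query)
    except Exception:
        # Another instance may have created it in the meantime; if the
        # collection now exists, treat that as success, otherwise re-raise.
        if name in existing():
            return False
        raise
    return True
```

A call such as `ensure_collection("http://example-solrcloud-common", "events", "events-config")` (all three values are placeholders) would then be safe to run from several instances, since a lost race is treated the same as "already present". It does not cover updating or removing collections, which is the maintenance burden I mentioned.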
SIP-18 seems the most promising solution, but it has no predictable release date. For clarification, would bootstrapping via ConfigMaps work similarly to the instructions in "Placement of core properties" <https://solr.apache.org/guide/solr/latest/configuration-guide/core-discovery.html#placement-of-core-properties>, just for collections? That would be ideal. The rejected alternative in SIP-18 (custom CRDs) would also be a great option, but only if ZooKeeper were deprecated and removed from Kubernetes deployments (as in Kafka with KRaft).

Are there any further discussions I could track for updates besides those in SOLR-16739 <https://issues.apache.org/jira/browse/SOLR-16739>?

Best regards,
Christos

On Mon, May 27, 2024 at 9:06 PM Jan Høydahl <jan....@cominvent.com> wrote:

> Hi
>
> You are correct that you'll use the ordinary Solr APIs to provision
> collections.
>
> Normally I'd recommend that the application that needs the collection
> should also have the responsibility of bootstrapping its config set and
> collection. Another popular option is to have an init container with the
> config and a script that bootstraps it.
>
> In SIP-18 it is proposed a solr module that allows you to bootstrap config
> set from ConfigMap. See https://issues.apache.org/jira/browse/SOLR-16739
> However, that effort seems to have stalled.
>
> Jan
>
> > On 27 May 2024 at 07:47, Christos Malliaridis <c.malliari...@gmail.com> wrote:
> >
> > Hello everyone,
> >
> > I am a new adopter of Solr and I am working on a Kubernetes cluster setup
> > where I run a helm chart for installing and configuring multiple components
> > of a backend, including Apache Kafka, Flink and Solr.
> >
> > Most components come with their own operator and CRDs which I install with
> > a script and then add resources via a custom helm chart and the provided
> > operator CRDs.
> >
> > For Solr I decided to use the Solr operator and use the SolrCloud CRD in my
> > helm chart to setup and configure SolrCloud. However, for setting up
> > collections in Solr in order to start sending documents I can only find
> > manual/interactive options, either by using a solr script or by using the
> > Admin UI.
> >
> > By looking for a solution that allows me to configure a collection via my
> > helm chart / CRDs, I saw that the SolrCollection CRD was removed very early
> > on to keep a single-source-of-truth.
> >
> > What is the current way to automatically configure collections in a setup
> > like above? Is it init containers that run curl commands / solr scripts?
> >
> > Thanks in advance,
> > Christos