Sent this once already but it never made it to the list.  Checked apache mail archives to make sure it wasn't just me.  It's not there.

On 9/21/22 07:10, Jan Høydahl wrote:
> Since 9.0 Solr can start with an empty SOLR_HOME as it will use defaults in their place <https://solr.apache.org/guide/solr/9_0/upgrade-notes/major-changes-in-solr-9.html>: "Solr no longer requires a solr.xml in $SOLR_HOME. If one is not found, Solr will instead use the default one from $SOLR_TIP/server/solr/solr.xml."

You can see how well I keep up with things: I didn't know that was already handled.  I haven't used Solr professionally for a few years now, which means there is little time during work hours to keep close track of Solr's progress.  Thank you for letting me know.  I think that means I can delete solr.xml from my tiny Solr install.

> I agree with David's comment on the JIRA that this is also a question about what we want solr.xml to be conceptually - a node-config file, or a cluster-config file. We have other locations for cluster-wide configuration. Today I think solr.xml is a mix of the two. A value like "zkClientTimeout" could be cluster-wide while "host" and "hostPort" are of course node-local.

Topics for different threads below, and I am aware that what I am describing is a TON of work.  I would help with it as much as I can.

I think the entire configuration system needs a revamp.

1) Choose one format for all configs.  Currently it is a mix of XML, properties, and JSON.  I really like the compactness of JSON, but the official standard does not support comments, and we use those extensively in the out-of-box XML configs.  Many of the libraries that parse JSON do have comment support, but I worry about relying on nonstandard extensions.  Related:  For JSON support, let's decide whether we are using Jackson or Noggit and remove the other one.  I suspect that some of our other dependencies depend on Jackson, which may make the choice for us. Can Jackson be used for the other things we do with XML?  I would like to reduce how many dependencies we have and make the download smaller.

2) Make sure the entire config system follows good inheritance rules.  Cluster config takes effect unless node config overrides, and so on.  Cluster config probably only applies to cloud mode, so it should be in ZK, and completely configurable in the admin UI. Default node config in cloud mode should probably actually be part of cluster config, and we could even have node-specific overrides in ZK as well so they are easy to edit centrally.  It would be very cool if there was central editing even for the things that normally go in /etc/default/solr.in.sh.

3) I think the admin UI should have an option to turn on in-UI editing of collection/core configurations, and absolutely everything those configurations can do should be available in the UI.  Leave that feature off by default as a security measure, and have a big red security warning on the button that turns it on.
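
To make the inheritance rule in item 2 concrete, here is a minimal Python sketch, not anything that exists in Solr today.  The function name, the three layers, and all of the values are hypothetical; only the setting names "zkClientTimeout" and "hostPort" come from this thread.  The idea is simply that a node override beats the cluster config, which beats the built-in defaults.

```python
from collections import ChainMap

# Hypothetical layers for illustration only; the values are made up.
BUILTIN_DEFAULTS = {"zkClientTimeout": 30000, "hostPort": 8983}
cluster_config = {"zkClientTimeout": 15000}   # would live in ZK in cloud mode
node_overrides = {"hostPort": 8984}           # node-specific override

def effective_config(node, cluster, defaults):
    """Merge the layers so that node beats cluster, and cluster beats defaults."""
    # ChainMap looks keys up left to right, so the first layer that
    # defines a key wins.
    return dict(ChainMap(node, cluster, defaults))

config = effective_config(node_overrides, cluster_config, BUILTIN_DEFAULTS)
print(config["zkClientTimeout"])  # taken from the cluster layer
print(config["hostPort"])         # taken from the node layer
```

A node with no overrides and an empty cluster config would just see the defaults, which is roughly what Solr 9 already does for a missing solr.xml.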

I imagine a SolrCloud world where EVERYTHING is configurable in the admin UI, with some of it turned off by default for security, where you can even change things like heap size and restart multiple Solr nodes all in the central UI.  It would be very nice if the UI even controlled ZK nodes.  As part of that, eliminating standalone mode is probably prudent.

A truly ambitious idea would be to have a full software suite that includes creating a VIP so there is automated redundancy of the central UI's IP address, with rpm and deb repos for easy install. Containers are very popular right now, so create something similar with Docker.

Thanks,
Shawn
