I sent this once already, but it never made it to the list. I checked the
Apache mail archives to make sure it wasn't just me. It's not there.
On 9/21/22 07:10, Jan Høydahl wrote:
> Since 9.0 Solr can start with an empty SOLR_HOME as it will use
> defaults in their place
> <https://solr.apache.org/guide/solr/9_0/upgrade-notes/major-changes-in-solr-9.html>:
> "Solr no longer requires a solr.xml in $SOLR_HOME. If one is not
> found, Solr will instead use the default one from
> $SOLR_TIP/server/solr/solr.xml."
You can see how much I keep up with things; I didn't know that was already
handled. I haven't used Solr professionally for a few years now, which
means there is little time during work hours to keep close track of
Solr's progress. Thank you for letting me know. I think that means I
can delete solr.xml from my tiny Solr install.
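For anyone following along, the lookup described in that upgrade note amounts to something like the sketch below. This is only my illustration of the documented behavior, not Solr's actual code; the class and method names (SolrXmlLocator, locate) are made up.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: these names do not exist in Solr.
final class SolrXmlLocator {

    // Prefer $SOLR_HOME/solr.xml; if it is absent, fall back to the
    // default shipped at $SOLR_TIP/server/solr/solr.xml.
    static Path locate(Path solrHome, Path solrTip) {
        Path inHome = solrHome.resolve("solr.xml");
        if (Files.exists(inHome)) {
            return inHome;
        }
        return solrTip.resolve("server").resolve("solr").resolve("solr.xml");
    }

    public static void main(String[] args) {
        // Example paths only; adjust to the local install.
        Path solrHome = Path.of(args.length > 0 ? args[0] : "/var/solr/data");
        Path solrTip = Path.of(args.length > 1 ? args[1] : "/opt/solr");
        System.out.println(locate(solrHome, solrTip));
    }
}
```

With an empty SOLR_HOME the second branch is taken, which is why deleting an unmodified solr.xml should be harmless.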
> I agree with David's comment on the JIRA that this is also a question
> about what we want solr.xml to be conceptually - a node-config file,
> or a cluster-config file. We have other locations for cluster-wide
> configuration. Today I think solr.xml is a mix of the two. A value
> like "zkClientTimeout" could be cluster-wide while "host" and
> "hostPort" are of course node-local.
Topics for different threads below, and I am aware that what I am
describing is a TON of work. I would help with it as much as I can.
I think the entire configuration system needs a revamp.
1) Choose one format for all configs. Currently it is a mix of XML,
properties, and JSON. I really like the compactness of JSON, but the
official standard does not support comments, and we use those
extensively in the out-of-box XML configs. Many of the libraries that
parse JSON do have comment support, but I worry about relying on
nonstandard extensions. Related: For JSON support, let's decide
whether we are using Jackson or noggit and remove the other one. I
suspect that some of our other dependencies depend on Jackson, which may
make the choice for us. Can Jackson be used for the other things we do
with XML? I would like to reduce how many dependencies we have and make
the download smaller.
2) Make sure the entire config system follows good inheritance rules.
Cluster config takes effect unless node config overrides, and so on.
Cluster config probably only applies to cloud mode, so it should be in
ZK, and completely configurable in the admin UI. Default node config in
cloud mode should probably actually be part of cluster config, and we
could even have node-specific overrides in ZK as well so they are easy
to edit centrally. It would be very cool if there was central editing
even for the things that normally go in /etc/default/solr.in.sh.
3) I think the admin UI should have an option to turn on in-UI editing
of collection/core configurations, and absolutely everything those
configs can do should be available in the UI. Leave that feature off by
default as a security measure, and have a big red security warning on the
button that turns it on.
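To make the inheritance idea in point 2 a bit more concrete, here is a rough sketch of what I mean by "cluster config takes effect unless node config overrides". Everything in it is hypothetical: LayeredConfig and resolve are names I made up, and the property values are only examples. Nothing like this exists in Solr today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: these names do not exist in Solr.
final class LayeredConfig {

    // Merge layers from lowest to highest precedence.
    // A key set in a later layer replaces the value from earlier layers.
    @SafeVarargs
    static Map<String, String> resolve(Map<String, String>... layers) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            effective.putAll(layer);
        }
        return effective;
    }

    public static void main(String[] args) {
        // Cluster-wide defaults, which would live in ZK in cloud mode.
        Map<String, String> clusterConfig =
                Map.of("zkClientTimeout", "30000", "hostPort", "8983");
        // Node-specific overrides, also stored centrally so they are easy to edit.
        Map<String, String> nodeOverrides = Map.of("hostPort", "8984");

        Map<String, String> effective = resolve(clusterConfig, nodeOverrides);
        // zkClientTimeout comes from the cluster layer, hostPort from the node layer.
        System.out.println(effective);
    }
}
```

The same merge would extend to more layers (built-in defaults, cluster, node, maybe collection) just by passing them in precedence order.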
I imagine a SolrCloud world where EVERYTHING is configurable in the
admin UI, with some of it turned off by default for security, where you
can even change things like heap size and restart multiple Solr nodes
all in the central UI. It would be very nice if the UI even controlled
ZK nodes. As part of that, eliminating standalone mode is probably prudent.
A truly ambitious idea would be to have a full software suite that
includes creating a VIP so there is automated redundancy of the central
UI's IP address, with rpm and deb repos for easy install. Containers are
very in right now, so create something similar with Docker.
Thanks,
Shawn