On 6/14/2022 10:22 AM, Christopher Schultz wrote:
> Does that mean I need to:
>
> 1. delete *:*
> 2. optimize
> 3. re-index everything
>
> Is #2 something available via the SolrJ client, or do I have to issue
> a REST call for that?
This code should delete everything, commit, and optimize, all with a
single request:
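    import java.io.IOException;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.AbstractUpdateRequest.ACTION;
    import org.apache.solr.client.solrj.request.UpdateRequest;

    public static void main(String[] args)
        throws SolrServerException, IOException {
      String collection = "dovecot";
      // try-with-resources closes the client when the request is done
      try (HttpSolrClient solrClient =
          new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
        UpdateRequest req2 = new UpdateRequest();
        // Both actions are sent as parameters on the same update request
        req2.setAction(ACTION.OPTIMIZE, true, true);
        req2.setAction(ACTION.COMMIT, true, true);
        req2.deleteByQuery("*:*");
        req2.process(solrClient, collection);
      }
    }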
I tried it out. The only thing left in the index directory when this
finishes is a segments_NNN file, and from what I can see, that file has
no version information in it. If you look at that file (using something
very simple like less) on an index that has segments, you will see
Lucene version numbers in it. So I do think that a delete-all with
optimize WILL create a version-clean empty index.
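On the REST side of the question: the same delete, commit, and optimize can be sketched as one plain HTTP request using only the JDK (Java 11+). This is a rough sketch, not something from SolrJ. The class and method names are made up, and it assumes the standard /update handler accepts commit=true and optimize=true as URL parameters alongside a JSON delete-by-query body.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class; the name is illustrative and not part of SolrJ.
class SolrRestWipe {

    // Builds a POST to the collection's /update handler. The JSON body
    // deletes every document; commit and optimize ride along as URL params.
    static HttpRequest buildWipeRequest(String baseUrl, String collection) {
        URI uri = URI.create(
                baseUrl + "/" + collection + "/update?commit=true&optimize=true");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"delete\":{\"query\":\"*:*\"}}"))
                .build();
    }

    // Sends the request and returns the HTTP status code from Solr.
    static int send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Only builds and prints the request; call send(...) to actually run it.
        HttpRequest request =
                buildWipeRequest("http://localhost:8983/solr", "dovecot");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Building the request separately from sending it makes it easy to log or inspect the URL before anything destructive happens.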
> I'm not sure we need that kind of complexity. I'm happy with my
> current re-index implementation because it's very straightforward. The
> only downside is that if you delete-all-documents before the re-index
> (which should be a rare process indeed), then ... the index won't show
> those records that haven't yet been re-indexed. The user-search in my
> application is mostly administrative, so it shouldn't impact many
> "regular" users.
You would likely benefit from reindexing into a build core and then
swapping that core with the live core. To make that a little cleaner, I
used s0_0 and s0_1 as directory names, with s0_build and s0_live as the
core names. That way the directory names do not have "build" and "live"
in them, which would get very confusing, because after a swap the live
core sits in what was the build directory and vice versa.
If you use this and have the build system also swap where it is in the
"indexing new data" process when it swaps the cores, it should all
proceed cleanly.
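A minimal sketch of the swap bookkeeping, using the core and directory names above. It assumes the CoreAdmin API's SWAP action at /solr/admin/cores. The class and its methods are hypothetical, and it only builds the URL and tracks which directory is behind which core name; it does not send anything to Solr.

```java
import java.net.URI;

// Hypothetical helper; names are illustrative and not from SolrJ.
class CoreSwapPlan {
    private static final String[] DIRS = {"s0_0", "s0_1"};

    // Index into DIRS of the directory currently behind the build core.
    private int buildIndex;

    CoreSwapPlan(int initialBuildIndex) {
        if (initialBuildIndex != 0 && initialBuildIndex != 1) {
            throw new IllegalArgumentException("index must be 0 or 1");
        }
        this.buildIndex = initialBuildIndex;
    }

    // Directory currently behind the s0_build core name.
    String buildDir() {
        return DIRS[buildIndex];
    }

    // Directory currently behind the s0_live core name.
    String liveDir() {
        return DIRS[1 - buildIndex];
    }

    // URL for the CoreAdmin SWAP action, which exchanges the two core names.
    static URI swapUri(String baseUrl, String buildCore, String liveCore) {
        return URI.create(baseUrl + "/admin/cores?action=SWAP&core="
                + buildCore + "&other=" + liveCore);
    }

    // Call once the SWAP request has succeeded: the directories trade roles.
    void recordSwap() {
        buildIndex = 1 - buildIndex;
    }
}
```

Usage would be something like: reindex into s0_build, request swapUri("http://localhost:8983/solr", "s0_build", "s0_live"), and call recordSwap() once Solr reports success.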
> I'd prefer to only make calls via SolrJ or, if necessary, via REST. So
> "ensuring the files are deleted from the disk" is not really
> possible... the Solr server is "over there" and so I can't see the
> disk from my application.
I believe the SolrJ code I pasted above will be useful to you.
Thanks,
Shawn