On 6/14/2022 10:22 AM, Christopher Schultz wrote:
> Does that mean I need to:
>
> 1. delete *:*
> 2. optimize
> 3. re-index everything
>
> Is #2 something available via the SolrJ client, or do I have to issue a REST call for that?

This code should delete everything, commit, and optimize, all with a single request:

    public static void main(String[] args) throws SolrServerException, IOException {
        HttpSolrClient.Builder builder = new HttpSolrClient.Builder("http://localhost:8983/solr");
        HttpSolrClient solrClient = builder.build();
        String collection = "dovecot";
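        UpdateRequest req2 = new UpdateRequest();
        // ACTION is AbstractUpdateRequest.ACTION; the two booleans are waitFlush and waitSearcher.
        req2.setAction(ACTION.OPTIMIZE, true, true);
        req2.setAction(ACTION.COMMIT, true, true);
        // Delete every document; the commit and optimize flags go out on the same request.
        req2.deleteByQuery("*:*");
        req2.process(solrClient, collection);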
    }

I tried it out.  The only thing left in the index directory when this finishes is a segments_NNN file, and from what I can see, that file has no version information in it.  If you look at that file (using something very simple like less) on an index that has segments, you will note that there are lucene version numbers in it.  So I do think that a delete all with optimize WILL create a version-clean empty index.
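
If SolrJ is not an option, the same delete, commit, and optimize can probably be sent as one plain HTTP POST to the update handler. The following is only a sketch, assuming Java 11 or later (for java.net.http) and the same base URL and "dovecot" collection as the SolrJ code; the class and method names are made up for illustration and it has not been run against a live server:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of SolrJ.
class SolrRestDeleteAll {

    // Builds POST <baseUrl>/<collection>/update?commit=true&optimize=true
    // with an XML delete-by-query body that matches every document.
    static HttpRequest buildRequest(String baseUrl, String collection) {
        URI uri = URI.create(baseUrl + "/" + collection + "/update?commit=true&optimize=true");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "text/xml")
                .POST(HttpRequest.BodyPublishers.ofString("<delete><query>*:*</query></delete>"))
                .build();
    }

    // Sends the request and returns the HTTP status code (needs a reachable Solr).
    static int send(HttpRequest request) throws Exception {
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Only prints the target; call send(...) to actually wipe the index.
        HttpRequest request = buildRequest("http://localhost:8983/solr", "dovecot");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Building the request separately from sending it makes it easy to log or inspect the URL before anything destructive happens.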

> I'm not sure we need that kind of complexity. I'm happy with my current re-index implementation because it's very straightforward. The only downside is that if you delete-all-documents before the re-index (which should be a rare process indeed), then ... the index won't show those records that haven't yet been re-indexed. The user-search in my application is mostly administrative, so it shouldn't impact many "regular" users.

You would likely benefit from reindexing into a build core and then swapping that core with the live core.  To make that a little cleaner, I used s0_0 and s0_1 as directory names, with s0_build and s0_live as the core names ... so that the directory names did not have "build" and "live" in them.  That could get very confusing, because after a swap a live core will sometimes exist in a build directory and vice versa.

If you use this and have the build system also swap where it is in the "indexing new data" process when it swaps the cores, it should all proceed cleanly.
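
The swap itself can be requested remotely through the CoreAdmin API (action=SWAP), so no disk access is needed. A rough sketch follows, assuming a standalone (non-SolrCloud) Solr at the same base URL as above, the core names s0_build and s0_live from this message, and Java 11 or later; the class and method names are hypothetical:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of SolrJ.
class SolrCoreSwap {

    // Builds GET <baseUrl>/admin/cores?action=SWAP&core=<core>&other=<other>
    static HttpRequest buildSwapRequest(String baseUrl, String core, String other) {
        String query = "action=SWAP"
                + "&core=" + URLEncoder.encode(core, StandardCharsets.UTF_8)
                + "&other=" + URLEncoder.encode(other, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/admin/cores?" + query))
                .GET()
                .build();
    }

    // Sends the swap and returns the HTTP status code (needs a reachable Solr).
    static int send(HttpRequest request) throws Exception {
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Only prints the target; call send(...) once the build core is fully indexed.
        HttpRequest request = buildSwapRequest("http://localhost:8983/solr", "s0_build", "s0_live");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Because the cores trade names, the indexing code can keep addressing s0_build after every swap without knowing which directory currently backs it.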

> I'd prefer to only make calls via SolrJ or, if necessary, via REST. So "ensuring the files are deleted from the disk" is not really possible... the Solr server is "over there" and so I can't see the disk from my application.

I believe the SolrJ code I pasted above will be useful to you.

Thanks,
Shawn
