I never trust a Solr upgrade path that carries an index from one major version to another. In my opinion the index has to be completely recreated with the updated schema, because there can be major changes between versions, even though it’s said you can go two versions up with the same index using the upgrade path. I’ve had to rebuild indexes that took weeks to coordinate, but the mechanism was in place and ready to go. I love the idea of holding one index in a core, building the next one in a secondary core, and then switching the names. It’s almost seamless and has been a trusted mechanism in traditional databases for decades.
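For what it's worth, here is a rough sketch of the name switch using the CoreAdmin SWAP action. The core names match the ones in the quoted message further down; the base URL, the function names, and the timeout are my own assumptions, so adjust them for your install.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def swap_url(base_url: str, live_core: str, build_core: str) -> str:
    """Build the CoreAdmin SWAP request URL for two cores on the same Solr node."""
    # SWAP exchanges the names of the two cores given in `core` and `other`.
    params = urlencode({"action": "SWAP", "core": build_core, "other": live_core})
    return f"{base_url.rstrip('/')}/admin/cores?{params}"


def swap_cores(base_url: str, live_core: str, build_core: str) -> int:
    """Send the SWAP request and return the HTTP status code."""
    # Assumes no authentication is required; add headers if your Solr is secured.
    with urlopen(swap_url(base_url, live_core, build_core), timeout=30) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical local node; only prints the URL, does not contact Solr.
    print(swap_url("http://localhost:8983/solr", "example", "example_build"))
```

Run the swap only after the rebuild in the build core has finished and been committed; until then, queries keep hitting the old index.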
Best of luck, but you should always have a path to completely destroy and rebuild a Solr index. It’s not a database, so it shouldn’t be trusted to stay consistent. If you want speed, the index is on an SSD, which can fail at any given moment; that’s just something to consider going forward.

Also, you can index documents asynchronously and fork the indexing process to speed it up. Something that takes four hours can be done in about one if it’s forked four ways, provided the Solr server has the CPUs and you commit wisely (don’t commit until your process is done).

Hope it works; I look forward to the follow-up.

Dave

> On May 20, 2023, at 1:53 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 5/19/23 15:39, Christopher Schultz wrote:
>> Please confirm the following:
>> 1. Solr index is created with Solr 7.something
>> 2. Solr 8.x is deployed and all is well
>> 3. Index is re-built by replacing 100% of documents in the index
>> 4. Solr 9.x is deployed and all is well
>> Is that correct, especially #4? I'd hate to have to literally delete the
>> index and re-create it, since it's supposed to be online all the time and it
>> takes hours to re-index everything.
>
> With that sequence, you might have a problem at step 4. I am not completely
> sure whether all the version 7 info is gone. It might work fine.
>
> Given that you're not in cloud mode, here is how I would arrange things. I
> have used this before with good success:
>
> * Two cores.
> * Directories named example_0 and example_1
> * Cores named example and example_build
>
> Build a new index in the example_build core and swap the cores using
> CoreAdmin when the full rebuild is done. Nothing ever goes down.
>
> Using the _0 and _1 directory names stays true to the principle of least
> surprise. Otherwise you will find yourself in a situation where the core
> named "example" is housed in a directory named "example_build" because the
> cores have been swapped.
>
> In cloud mode, I would use the alias feature. Have collections named
> "example_2023.05.20" (or whatever naming convention makes sense to you), with
> an alias named example that points to whichever real collection is online.
>
> Thanks,
> Shawn
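For the cloud-mode alias approach in the quoted message, a minimal sketch of the two pieces might look like this: a dated collection name, and the Collections API CREATEALIAS request that points the alias at it. Issuing CREATEALIAS for an alias that already exists re-points it. The base URL and both function names are assumptions for illustration, not anything from the thread.

```python
from datetime import date
from urllib.parse import urlencode


def dated_collection(prefix: str, day: date) -> str:
    """Name a collection after its build date, e.g. example_2023.05.20."""
    return f"{prefix}_{day:%Y.%m.%d}"


def create_alias_url(base_url: str, alias: str, collection: str) -> str:
    """Build the Collections API URL that points `alias` at `collection`."""
    params = urlencode(
        {"action": "CREATEALIAS", "name": alias, "collections": collection}
    )
    return f"{base_url.rstrip('/')}/admin/collections?{params}"


if __name__ == "__main__":
    # Hypothetical node address; this only prints the request URL.
    target = dated_collection("example", date(2023, 5, 20))
    print(create_alias_url("http://localhost:8983/solr", "example", target))
```

Clients keep querying the alias name, so the switch happens when the request is sent with whatever HTTP client you prefer, and the old collection can be deleted once nothing depends on it.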