Rahul,

On 2025/09/19 03:48:34 Rahul Goswami wrote:
> Chris,
> Thanks for taking a stab at this. If I understand correctly, the flow you
> are planning on is Solr X converts all X-1 segments to X on a hot index -->
> shut down server (this is very important in the context of the code
> presented here due to possible race condition with commit) --> run this
> script on a cold index ---> bring up the server ?

Exactly. I have now tested an index created in Solr 7, run on Solr 8
until the segments have all been migrated to Lucene 8, stopped the
server, run this process to force the index from 7 to 8 and started
Solr 8. Solr seems to be perfectly happy with this.

Unfortunately, running Solr 9 at this point does not work: there are
some classes defined as part of the index configuration that are no
longer present in Solr or Lucene and so the cores fail to load. So for
the time being, I'm stuck on Solr 8 but I suspect I can reconfigure the
indexes to remove or replace those classes.

Note that I'm doing everything in a testing environment. :)

>> actually run this on a test index originally created with Solr/Lucene 7,
>> but running on Solr 8. There were no issues stopping Solr 8, running this
>> to force the index to move to version 8, then restarting Solr. Everything
>> works as expected.
>
> Curious to know how you converted all segments to Solr 8 before triggering
> this code here.

In this particular case, I re-indexed every document I had, forcing
Solr to create new segments to hold the new copies of those documents.
At the end of the process, all segments were Lucene 8 and all Lucene 7
segments had been removed.

>
> I ask because if you didn't, it likely would still open the
> index fine, but there are minVersion/version checks in several flows in
> Lucene and you don't want to end up in an inconsistent state that way. It
> would be quite reasonable for Lucene to see indexCreatedVersionMajor to be
> X, but segments having traces of X-1 and declare the index corrupted in
> some flow. Also, not correctly converting all segments to version X *in
> every aspect* before the major version flip would violate the guarantee
> that even in case of a breaking change, your index is indeed
> lossless/uncorrupted after upgrade.

Understood. My expectation was that I would use your prevent-merge
process at some point in the future.
This program was only expected to do the last Lucene-only part of your
process without actually asking the Lucene folks to modify their APIs.

> I am also not sure how I feel about the use of reflection to force set the
> indexCreatedVersionMajor (a private final currently) . That _might_ be a
> blocker for Solr to maintain this code (?). It would be preferable if we
> could rely on APIs (effort to push for the same underway at
> https://github.com/apache/lucene/pull/14607 ). Personally, I would like to
> see this materialize this way.

Sure, but in the meantime...

> Meanwhile I do believe there is value in exploring a fallback to doing this
> entirely within Solr. Mike McCandless's suggestion at the talk to open a
> new IndexWriter in a parallel data directory and copy over the SegmentInfo
> list from the older SegmentInfos seems like a reasonable option in
> that direction. If you have the bandwidth, would you be willing to give
> that a try?

How well would that work with a hot index? Probably documents that had
already hit the disk would be fine, but then ... I see lots of race
conditions and potential pitfalls.

-chris
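
P.S. To make the "no traces of X-1" precondition discussed above concrete, here is a minimal sketch of the kind of pre-flight check one could run on the cold index before flipping indexCreatedVersionMajor. It is plain Java with no Lucene dependency: the `Segment` record, `staleSegments` and `safeToFlip` are hypothetical names made up for illustration, not part of Lucene, Solr, or the program discussed in this thread. In a real tool the two version numbers would presumably come from each segment's metadata (I believe `SegmentInfo.getMinVersion()` and `SegmentInfo.getVersion()` are the relevant accessors, but treat that as an assumption).

```java
import java.util.List;

public class SegmentVersionCheck {

    /** Hypothetical stand-in for per-segment metadata; NOT a Lucene class. */
    record Segment(String name, int minVersionMajor, int versionMajor) {}

    /** Names of segments that still carry anything older than targetMajor. */
    static List<String> staleSegments(List<Segment> segments, int targetMajor) {
        return segments.stream()
                .filter(s -> s.minVersionMajor() < targetMajor
                          || s.versionMajor() < targetMajor)
                .map(Segment::name)
                .toList();
    }

    /** True only if every segment was written entirely by targetMajor or later. */
    static boolean safeToFlip(List<Segment> segments, int targetMajor) {
        return staleSegments(segments, targetMajor).isEmpty();
    }

    public static void main(String[] args) {
        // Fully re-indexed: every segment was created from scratch by version 8.
        List<Segment> reindexed = List.of(
                new Segment("_a", 8, 8),
                new Segment("_b", 8, 8));

        // Assumed scenario: "_c" was written by version 8 but merged from
        // version 7 segments, so its minimum version is still 7.
        List<Segment> merged = List.of(
                new Segment("_a", 8, 8),
                new Segment("_c", 7, 8));

        System.out.println(safeToFlip(reindexed, 8)); // true
        System.out.println(staleSegments(merged, 8)); // [_c]
    }
}
```

The point of checking the minimum version as well as the written version is that a segment can be written by the new major version and still contain data that originated in the old one, which is exactly the inconsistent state described above.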
