Shawn,

Thanks for looking further into this. Although many of our Solr instances do run on Windows servers, I have been running this particular reindexing program on Linux for testing, to take the OS variable out of the equation for now. The behavior I described in my original email occurs on Linux. After enough troubleshooting (and code reading), it seems that Lucene maintains a ref count internally which is not going down to 0, thereby making the segments ineligible for deletion.
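To make the ref-count reasoning concrete, here is a minimal sketch of reference-counted segment files. It is a toy model, not Lucene's actual code: the class SegmentRefCounts and the file name _0.cfs are made up for illustration. (Lucene's IndexReader does expose incRef(), decRef() and getRefCount(), but the bookkeeping below is only an assumption about the general idea.)

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of reference-counted segment files; NOT Lucene's real implementation. */
class SegmentRefCounts {
    // file name -> number of holders (commit points, open readers, ...)
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** A commit point or an open reader takes one reference on a file. */
    void incRef(String file) {
        refCounts.merge(file, 1, Integer::sum);
    }

    /** Drops one reference; returns true only when the count reached 0 (file is deletable). */
    boolean decRef(String file) {
        Integer count = refCounts.get(file);
        if (count == null) {
            throw new IllegalStateException("no references held on " + file);
        }
        if (count == 1) {
            refCounts.remove(file);
            return true;
        }
        refCounts.put(file, count - 1);
        return false;
    }

    /** Current number of references on a file (0 if nobody holds it). */
    int refCount(String file) {
        return refCounts.getOrDefault(file, 0);
    }

    /** Files that still have at least one holder and so cannot be deleted. */
    Set<String> liveFiles() {
        return new TreeSet<>(refCounts.keySet());
    }

    public static void main(String[] args) {
        SegmentRefCounts counts = new SegmentRefCounts();
        counts.incRef("_0.cfs"); // held by the current commit point
        counts.incRef("_0.cfs"); // held by a reader opened on the side

        // A newer commit stops referencing the old segment, but the reader still holds it.
        System.out.println("deletable after commit: " + counts.decRef("_0.cfs"));

        // Only when the last holder lets go does the count reach 0.
        System.out.println("deletable after reader release: " + counts.decRef("_0.cfs"));
    }
}
```

In this model a commit alone never frees a file that another holder has not released, which matches the symptom described above: if some reference is leaked, the count stays above 0 no matter how many commits follow.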
What is baffling is that even after the reader is closed and I am done processing all the required segments, issuing a commit through the code still has no effect. Only two things help with the cleanup: (i) a Solr restart, (ii) a core reload. Unfortunately, neither of these approaches is practical for my use case, since I can't wait for the whole processing to finish before reclaiming the space, especially when some of the cores are 3-4 TB in size.

Thanks,
Rahul

On Sat, Sep 2, 2023 at 4:45 PM Shawn Heisey <apa...@elyograg.org> wrote:

> On 9/1/23 16:30, Rahul Goswami wrote:
> > Thanks for your response. To your question about locking, I am not doing
> > anything explicitly here. If you are alluding to deleting the write.lock
> > file and opening a new IndexWriter, I am not doing that. Only an
> > IndexReader.
> >
> > Are you suggesting opening an IndexReader from within Solr could interfere
> > with Solr's working and in turn file deletions? I think an answer to this
> > question would really help me understand what is going wrong.
>
> I don't know what exactly the effects are of opening just a reader with
> Lucene.
>
> I had another thought, and then I did a little searching on my list
> archive to see if I could answer a question: What OS is this on?
>
> Other messages you've written say that you're running on Windows.
>
> Windows does something that on the surface sounds like a good thing: If
> a file is open in ANY mode, including read-only, Windows will not allow
> that file to be deleted.
>
> So I think the problem here is that you've got a Lucene program keeping
> those segment files open, so when the Lucene code running in Solr tries
> to delete them as a normal part of commit operations, it can't.
>
> If you were running this on pretty much any other OS, you probably
> wouldn't be having this problem. Other operating systems like Linux
> allow file deletion even if the file is open elsewhere. The file will
> continue to exist on the filesystem until the last program that has it
> open exits or closes the file, at which time the filesystem will finish
> the deletion.
>
> If you have to stick with Windows, then you're going to have to do
> something after your program closes its reader to trigger Lucene's
> auto-cleanup of segments. I believe a Solr index reload would
> accomplish that. Another way might be to index a dummy document, delete
> that document, and issue a commit.
>
> Thanks,
> Shawn
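
The point in the quoted message about Linux allowing deletion of a file that is still open can be checked with a small standalone program. This is only a sketch using the JDK (Java 11 or later assumed); the class OpenFileDeleteDemo and the temp-file name are made up for illustration, and nothing here touches Lucene or Solr.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Deletes a file while a read handle is open, then reads through that handle. */
class OpenFileDeleteDemo {

    /** What the experiment observed. */
    static final class Outcome {
        final String dataRead;         // bytes read AFTER the delete call
        final boolean pathStillExists; // is the path still visible in the directory?

        Outcome(String dataRead, boolean pathStillExists) {
            this.dataRead = dataRead;
            this.pathStillExists = pathStillExists;
        }
    }

    static Outcome deleteWhileOpen(String content) {
        try {
            Path file = Files.createTempFile("segment-demo", ".bin");
            Files.writeString(file, content);
            try (FileInputStream in = new FileInputStream(file.toFile())) {
                // On Linux this only removes the directory entry; the data stays
                // until the last open handle is closed. On Windows this call is
                // expected to throw instead, per the behavior described above.
                Files.delete(file);
                boolean stillExists = Files.exists(file);
                String data = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                return new Outcome(data, stillExists);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Outcome outcome = deleteWhileOpen("segment data");
        System.out.println("path still listed: " + outcome.pathStillExists);
        System.out.println("read via open handle: " + outcome.dataRead);
    }
}
```

On Linux the path should disappear immediately while the content remains readable through the open handle, and the disk space is only released once that handle is closed. That is also why a long-lived open handle on Linux can keep space occupied even though the files no longer show up in the index directory.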