On 3/13/23 05:59, Steffen Moldenhauer wrote:
esp. the recursive chown seems to be the cause.  We mitigated it a bit by 
cleaning up the backup volume regularly. But there are deployments with a 
larger number of collections (~400) and then it is still slow.

Matthew's question is exactly what I was thinking. On almost any *NIX filesystem, a recursive chown over the directory structure that Solr creates should not take that long, so we really want to know what kind of filesystem it's on.

If it's a network filesystem (NFS, SMB, S3, Google Drive, or similar) then the chown might take a long time, as that is the nature of a network filesystem. Doing a find piped through xargs/chown is likely to take almost as long as the recursive chown does.
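
For reference, here is a minimal sketch of the find-based variant, assuming a POSIX shell. The function name fix_owner and the example path and user in the comments are made up for illustration. It uses find's "-exec ... {} +" form, which batches arguments much like piping through xargs. chown is only called on entries with the wrong owner, but find still has to stat every entry, so on a network filesystem it is likely to take almost as long.

```shell
# Hypothetical helper: chown only the entries whose owner is wrong.
#   $1 = directory to walk
#   $2 = expected owner (user name or numeric uid)
fix_owner() {
    dir=$1
    user=$2
    # find must still stat every entry; on a network filesystem that
    # walk is the expensive part, not the chown calls themselves.
    # -h makes chown act on a symlink itself, which is what find tested.
    find "$dir" ! -user "$user" -exec chown -h "$user" {} +
}

# Example call (path and user are assumptions, adjust to your setup):
#   fix_owner /var/solr/backup solr
```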

It does seem likely that you'd have backups on a network filesystem, as that's how the backup feature in Solr is designed to work -- the same filesystem mounted in the same place on all Solr nodes.

Maybe the startup can be changed so it does the chown in the background instead of holding up the Solr start. I have never used the operator so I don't know anything about it.
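
In plain shell terms, the background idea might look like the sketch below. This is not how the operator does its startup, which I have not looked at. The function name, the BG_CHOWN_PID variable, and the example path, user, and log file are all assumptions for illustration.

```shell
# Hypothetical helper: start a recursive chown without blocking the caller.
#   $1 = directory, $2 = owner, $3 = log file for chown's output
# The PID of the background job is left in BG_CHOWN_PID.
start_background_chown() {
    chown -R "$2" "$1" >"$3" 2>&1 &
    BG_CHOWN_PID=$!
}

# Example use in a startup script (path, user, and log are assumptions):
#   start_background_chown /var/solr/backup solr /tmp/backup-chown.log
#   ... start Solr here without waiting ...
#   wait "$BG_CHOWN_PID"   # optional, only if the result is needed later
```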

Another idea is to remove the chown from the startup and have it run periodically on the system that shares the filesystem over the network. Or do the chown manually once, remove the chown from the startup, and just don't worry about it, because if Solr is the only thing writing to the backup location, everything should have correct permissions.
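
If you go the "chown once and stop worrying" route, a quick read-only check can confirm that nothing else is writing to the backup location. This is a minimal sketch, assuming a POSIX shell; the function name and the example path and user are made up.

```shell
# Hypothetical check: count entries under $1 that are NOT owned by $2.
# Changes nothing. File names containing newlines would be miscounted.
count_wrong_owner() {
    find "$1" ! -user "$2" -print | wc -l
}

# Example (path and user are assumptions):
#   count_wrong_owner /var/solr/backup solr
# A result of 0 means every entry already has the expected owner.
```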

Thanks,
Shawn
