dsmiley commented on PR #1729: URL: https://github.com/apache/solr/pull/1729#issuecomment-1610655423
Thanks for weighing in, Ilan. My proposal here was motivated by the goal of providing something very simple, which I hope you can see is a desirable quality. @psalagnac's POC in BackupCmd (i.e. "early" instead of "late") is very complex in its tracking, and it protects against neither multiple concurrent backups being issued nor restore I/O.

> It can lead to starting execution of a very large number of commands and blocking them all waiting for each other.

There is no free lunch if we limit resources -- commands will compete. But I do get your point. If a _very naive_ backup client chose to bombard Solr with backup requests, they would all be accepted and would occupy threads within Solr, waiting for a very long time.

I suppose a way to wrestle with this decision is to consider what a backup-invoking daemon should do to do its job _well_ (not naively). I imagine a cron/timer that loops over all collections to back them up. One-at-a-time would take poor advantage of resources, but we could imagine a client that uses a Semaphore with permits equal to the number of nodes * a rough number of simultaneous backups desired. Backup of a collection would grab a number of permits equal to its shard count, limited by the maximum chosen for the Semaphore. The POC here protects against what a backup client can't possibly protect against -- a collection with way more shards than there are nodes. By the way -- that's really rare as far as Solr use-cases out there go, so the POC is motivated by providing a simple solution to something rare.

It would be nice to handle the naive/abuse case better though. It wouldn't be hard to fail the backup if the number of waiters on the Semaphore in this POC is very large (> 100?). Yes, it's possible for a partial backup to fail, but that's okay. Simple protection vs absolute efficiency / guarantee.

I already provided an option for prioritizing restores over backups -- which doesn't seem to favor an "early" vs "late" design.
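To make the Semaphore idea above concrete, here is a minimal sketch of such a throttle. It is not code from this PR or from Solr: the class `BackupThrottle`, its methods, and the parameters `nodeCount`, `simultaneousBackups` and `maxWaiters` are all hypothetical names chosen for illustration. It assumes permits = nodes * desired simultaneous backups, that each backup takes permits equal to its shard count capped at the Semaphore's maximum, and that a request is rejected once too many callers are already waiting.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical throttle for backups; an illustration only, not a Solr API. */
final class BackupThrottle {
    private final Semaphore permits;
    private final int maxPermits;
    private final int maxWaiters;

    BackupThrottle(int nodeCount, int simultaneousBackups, int maxWaiters) {
        if (nodeCount < 1 || simultaneousBackups < 1) {
            throw new IllegalArgumentException("nodeCount and simultaneousBackups must be >= 1");
        }
        // Assumption: total permits = number of nodes * desired simultaneous backups.
        this.maxPermits = nodeCount * simultaneousBackups;
        this.permits = new Semaphore(maxPermits, true); // fair: waiters are served in order
        this.maxWaiters = maxWaiters;
    }

    /** Permits a collection needs: its shard count, capped at the Semaphore's maximum. */
    int permitsFor(int shardCount) {
        return Math.max(1, Math.min(shardCount, maxPermits));
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    /** Runs one backup while holding its permits; rejects it if the queue is already long. */
    void runBackup(int shardCount, Runnable backup) throws InterruptedException {
        // getQueueLength() is only an estimate, which is enough for a coarse abuse guard.
        if (permits.getQueueLength() > maxWaiters) {
            throw new IllegalStateException("Too many backups already waiting");
        }
        int needed = permitsFor(shardCount);
        permits.acquire(needed);
        try {
            backup.run();
        } finally {
            permits.release(needed);
        }
    }
}
```

A scheduler looping over collections might call `throttle.runBackup(shardCount, () -> backUp(collection))` from a worker pool, where `backUp` stands in for whatever issues the actual backup request. Capping at `maxPermits` means a collection with more shards than permits can still run, just with the whole budget to itself.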
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org