Trying to summarize this thread a bit. I apologize in advance if I have forgotten something or misrepresented any of the points that were raised (feel free to correct / add).

Denis summed up the following problems that might happen while 'verify' locks rep-cache.db:

> 1. a post-commit error "database is locked"
> 2. new representations will not be added in the rep-cache.db
> 3. deduplication does not work for new data committed at this time
> 4. commits work with delays.

We have also established that the new tool build-repcache is not suitable for post-factum fixing of 3). It does not reprocess already committed revisions.

We are currently considering two approaches to address these issues:

1) Let verify process the rep-cache entries in small batches, without holding an SQLite lock (Denis' patch).

   pro:
   + Fixes #1 through #4.

   con:
   - Relies more heavily on the SQLite guarantee that all rows that were present at the start of 'verify' are still readable and correct after verify has finished. SQLite might have subtle bugs in this area, and verify should be as conservative / careful as possible.

2) Shard rep-cache.db to make the locking window smaller (Daniel's proposal).

   pro:
   + Fixes #1 through #4 (if the shard size is 'small enough').

   con:
   - If we ever need atomic operations on the entire rep-cache, we need to forbid rep-cache.db shards from using WAL mode and use the ATTACH DATABASE statement with the master journal (which is rarely used and is not supported by all journaling modes).
   - Requires a format bump, which means it will only work if the admin has run 'svnadmin upgrade'.
   - May not fully fix the problem if the shard size is too large and verification of a single shard still takes too much time (e.g. because it's located on a network drive).

I'll add one more concern of my own here, regarding the 'sharding' approach: I'd like to warn against the NIHS (Not Invented Here Syndrome) that comes peeking around the corner if we say "SQLite might have subtle bugs that might hurt us if we do X, but rolling our own solution might be better". Why would "rolling our own solution" like sharding rep-cache.db be less susceptible to such subtle bugs than SQLite?
Okay, on the one hand SQLite is more complex, because it's generic database software. But on the other hand it presumably has a lot more users / audience than just Subversion. I have no clear answer here.

HTH,
--
Johan
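
P.S. To make approach 1) a bit more concrete, here is a minimal Python sketch of what "small batches, without holding an SQLite lock" could look like. This is not Denis' patch, just an illustration under assumptions: the two-column rep_cache table is a simplified stand-in for the real schema, and iter_batches, verify_in_batches, make_demo_db and the check_row callback are hypothetical names I made up for the example.

```python
import sqlite3


def iter_batches(conn, batch_size):
    """Yield rep-cache rows in hash order, one short query per batch.

    Each batch is fetched completely with fetchall(), so the SELECT has
    finished (and its read lock is gone) before the caller starts the
    possibly slow checking of the rows. Writers can get in between batches.
    """
    last_hash = ""  # every real hash sorts after the empty string
    while True:
        rows = conn.execute(
            "SELECT hash, revision FROM rep_cache "
            "WHERE hash > ? ORDER BY hash LIMIT ?",
            (last_hash, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Continue after the last key we saw, instead of using OFFSET,
        # so rows inserted meanwhile cannot shift our position.
        last_hash = rows[-1][0]


def verify_in_batches(conn, check_row, batch_size=1000):
    """Return the hashes of all rows for which check_row(row) is false."""
    bad = []
    for batch in iter_batches(conn, batch_size):
        for row in batch:
            if not check_row(row):
                bad.append(row[0])
    return bad


def make_demo_db():
    """Build a tiny in-memory database (simplified, assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rep_cache (hash TEXT PRIMARY KEY, "
        "revision INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO rep_cache VALUES (?, ?)",
        [("aa", 1), ("bb", 2), ("cc", -1), ("dd", 4), ("ee", 5)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    # Stand-in check: a real verify would compare against the rev files.
    print(verify_in_batches(make_demo_db(), lambda row: row[1] >= 0, 2))
```

Note that rows committed while this runs are only visited if their hash sorts after the current position. That is fine for verify as long as every row that was present at the start is still readable, which is exactly the SQLite guarantee the "con" of approach 1) is about.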