> > if a commit occurs during the rep-cache.db verification, this can lead to:
> > 1. a post-commit error "database is locked"
> > 2. new representations will not be added in the rep-cache.db
> > 3. deduplication does not work for new data committed at this time
> > 4. commits work with delays.
>
> As I said, you accurately describe the observed behaviour.  However,
> given the misunderstanding upthread, I would still like to ask you to
> make it unambiguously clear which of those four items are requisites of
> your use-case.

I'm not sure that I correctly understand the question about the requisites
of the use case.  The use case I am talking about is running verify on a hot
repository.  The points listed above are the negative consequences that
occur in this case.  I think the best way is to fix all of these
consequences, if possible.  The proposed patch does exactly that.

> I look to hearing Denis's concerns with the sharding approach.

It seems to me that this approach has at least the following potential
problems:

- Supporting multiple databases may bring considerable difficulties.  For
  example, a commit may need to open not one but up to all of the existing
  databases, which may affect commit performance.  In addition, if we ever
  need atomic operations on the entire rep-cache, we will have to use the
  ATTACH DATABASE statement [1] with the master journal [2], which I think
  is not used anywhere now and is not supported by all journaling modes.

- I assume that reading and verifying all entries in one shard will be
  performed while holding a SQLite lock, because otherwise we are back to
  variations of the proposed patch and the sharding approach is not
  necessary at all.  But if the main part of the verification (for example,
  reading the revision content) is performed while holding the lock, the
  problem may still occur in some cases, because this part can potentially
  take a long time (for example, if the repositories are located on a
  network share).  So the problem will not be completely fixed.

- If I'm not mistaken, this approach requires a format bump, so it does not
  fix the problem for existing repositories.  The division into shards also
  has to be performed in some form, which means that a fast in-place upgrade
  probably will not fix the problem either.

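To illustrate the first point, here is a minimal sketch of what an atomic
update across two shards could look like with ATTACH DATABASE.  It uses
Python's sqlite3 module rather than the Subversion C code.  The shard file
names, the two-column rep_cache table and the helper names are all
hypothetical and much simpler than the real rep-cache schema.  As far as I
understand [1], atomicity across attached databases relies on the rollback
journal and does not hold when the journal mode is WAL.

```python
import sqlite3


def create_shard(path):
    """Create a hypothetical, simplified rep-cache shard (not the real schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rep_cache ("
        "hash TEXT PRIMARY KEY, revision INTEGER NOT NULL)"
    )
    conn.commit()
    conn.close()


def count_rows(path):
    """Return the number of entries stored in one shard."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM rep_cache").fetchone()[0]
    finally:
        conn.close()


def commit_across_shards(main_path, shard_path, rows_main, rows_shard):
    """Insert rows into two shards inside a single transaction.

    Every shard touched by the commit has to be opened (attached) first,
    which is the extra per-commit cost mentioned above.
    """
    # isolation_level=None: the module issues no implicit BEGIN,
    # so the transaction boundaries below are explicit.
    conn = sqlite3.connect(main_path, isolation_level=None)
    try:
        conn.execute("ATTACH DATABASE ? AS shard", (shard_path,))
        conn.execute("BEGIN IMMEDIATE")
        try:
            conn.executemany(
                "INSERT INTO main.rep_cache (hash, revision) VALUES (?, ?)",
                rows_main,
            )
            conn.executemany(
                "INSERT INTO shard.rep_cache (hash, revision) VALUES (?, ?)",
                rows_shard,
            )
            conn.execute("COMMIT")
        except Exception:
            # Undo the changes made to both shards.
            conn.execute("ROLLBACK")
            raise
    finally:
        conn.close()
```

In this sketch a failure while writing to the second shard also rolls back
the rows already written to the first one, which is the behavior we would
need for operations on the entire rep-cache.
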
> In this case, the tradeoff would seem to be among:
>
> - ship 1.14's «verify» and require «build-repcache» to be run afterwards;
>
> - ship the «verify» in the OP, about whose correctness we are less
> certain, but which doesn't require running «build-repcache» afterwards;

Speaking of tradeoffs, I would like to note that these cases are not
equivalent in terms of visible behavior, because the requirement to run
build-repcache does not fix 1), 3) and 4).

[1] https://www.sqlite.org/lang_attach.html
[2] https://www.sqlite.org/tempfiles.html#master_journal_files

Regards,
Denis Kovalchuk